icefall-asr-zipformer-multi-zh-en-2023-11-22/decoding_result/greedy_search/log-decode-epoch-34-avg-19-context-2-max-sym-per-frame-1-use-averaged-model-2023-11-16-14-14-27
jinzr
added results
475fe4d
2023-11-16 14:14:27,517 INFO [decode.py:688] Decoding started
2023-11-16 14:14:27,517 INFO [decode.py:694] Device: cuda:0
2023-11-16 14:14:27,523 INFO [decode.py:704] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '821ebc378e7fb99b8adc81950227963332821e01', 'k2-git-date': 'Wed Jul 19 15:38:25 2023', 'lhotse-version': '1.17.0.dev+git.b3dc9faf.dirty', 'torch-version': '1.11.0+cu102', 'torch-cuda-available': True, 'torch-cuda-version': '10.2', 'python-version': '3.9', 'icefall-git-branch': 'dev/bilingual', 'icefall-git-sha1': '4897f2c0-dirty', 'icefall-git-date': 'Thu Sep 28 11:38:28 2023', 'icefall-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/icefall-1.0-py3.9.egg', 'k2-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/k2-1.24.3.dev20230721+cuda10.2.torch1.11.0-py3.9-linux-x86_64.egg/k2/__init__.py', 'lhotse-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/lhotse-1.17.0.dev0+git.b3dc9faf.dirty-py3.9.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0423201334-6587bbc68d-tn554', 'IP address': '10.177.74.211'}, 'epoch': 34, 'iter': 0, 'avg': 19, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-w-tal-csasr'), 'bpe_model': 'data/lang_bbpe_2000/bbpe.model', 'lang_dir': PosixPath('data/lang_bbpe_2000'), 'decoding_method': 'greedy_search', 'beam_size': 4, 'beam': 20.0, 'ngram_lm_scale': 0.01, 'max_contexts': 8, 'max_states': 64, 'context_size': 2, 'max_sym_per_frame': 1, 'num_paths': 200, 'nbest_scale': 0.5, 'use_tal_csasr': False, 'use_librispeech': False, 'use_aishell2': False, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': 
'192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'use_transducer': True, 'use_ctc': False, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 300.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'res_dir': PosixPath('zipformer/exp-w-tal-csasr/greedy_search'), 'suffix': 'epoch-34-avg-19-context-2-max-sym-per-frame-1-use-averaged-model', 'blank_id': 0, 'unk_id': 2, 'vocab_size': 2000}
2023-11-16 14:14:27,523 INFO [decode.py:706] About to create model
2023-11-16 14:14:28,124 INFO [decode.py:773] Calculating the averaged model over epoch range from 15 (excluded) to 34
2023-11-16 14:14:41,200 INFO [decode.py:807] Number of model parameters: 68625511
2023-11-16 14:14:41,201 INFO [multi_dataset.py:142] About to get multidataset test cuts
2023-11-16 14:14:41,201 INFO [multi_dataset.py:145] Loading Aishell-2 set in lazy mode
2023-11-16 14:14:41,201 INFO [multi_dataset.py:157] Loading TAL-CSASR set in lazy mode
2023-11-16 14:14:42,014 INFO [decode.py:831] Start decoding test set: tal_csasr_test
2023-11-16 14:14:44,880 INFO [decode.py:585] batch 0/?, cuts processed until now is 27
2023-11-16 14:15:54,388 INFO [decode.py:585] batch 50/?, cuts processed until now is 2119
2023-11-16 14:16:59,489 INFO [decode.py:585] batch 100/?, cuts processed until now is 4248
2023-11-16 14:17:55,206 INFO [zipformer.py:1853] name=None, attn_weights_entropy = tensor([1.5752, 2.5947, 2.3506, 2.2818, 2.2614, 2.6720, 2.1701, 2.6863],
device='cuda:0')
2023-11-16 14:18:00,298 INFO [decode.py:585] batch 150/?, cuts processed until now is 6676
2023-11-16 14:19:03,913 INFO [decode.py:585] batch 200/?, cuts processed until now is 9045
2023-11-16 14:19:44,165 INFO [zipformer.py:1853] name=None, attn_weights_entropy = tensor([5.8812, 5.3621, 5.5944, 5.7414], device='cuda:0')
2023-11-16 14:20:13,014 INFO [decode.py:585] batch 250/?, cuts processed until now is 11158
2023-11-16 14:20:41,518 INFO [zipformer.py:1853] name=None, attn_weights_entropy = tensor([2.0741, 3.4315, 2.9031, 2.7450, 2.9100, 3.3876, 2.6770, 3.3892],
device='cuda:0')
2023-11-16 14:21:12,327 INFO [decode.py:585] batch 300/?, cuts processed until now is 13809
2023-11-16 14:21:51,844 INFO [zipformer.py:1853] name=None, attn_weights_entropy = tensor([4.7611, 4.4941, 4.3225, 4.2480], device='cuda:0')
2023-11-16 14:21:52,987 INFO [decode.py:601] The transcripts are stored in zipformer/exp-w-tal-csasr/greedy_search/recogs-tal_csasr_test-greedy_search-epoch-34-avg-19-context-2-max-sym-per-frame-1-use-averaged-model.txt
2023-11-16 14:21:53,571 INFO [utils.py:565] [tal_csasr_test-greedy_search] %WER 6.69% [22424 / 334989, 3750 ins, 5288 del, 13386 sub ]
2023-11-16 14:21:54,715 INFO [decode.py:614] Wrote detailed error stats to zipformer/exp-w-tal-csasr/greedy_search/errs-tal_csasr_test-greedy_search-epoch-34-avg-19-context-2-max-sym-per-frame-1-use-averaged-model.txt
2023-11-16 14:21:54,718 INFO [decode.py:630]
For tal_csasr_test, WER of different settings are:
greedy_search 6.69 best for tal_csasr_test
2023-11-16 14:21:54,719 INFO [decode.py:831] Start decoding test set: tal_csasr_dev
2023-11-16 14:21:57,172 INFO [decode.py:585] batch 0/?, cuts processed until now is 26
2023-11-16 14:22:34,723 INFO [decode.py:585] batch 50/?, cuts processed until now is 2081
2023-11-16 14:23:17,362 INFO [decode.py:585] batch 100/?, cuts processed until now is 4451
2023-11-16 14:23:26,748 INFO [zipformer.py:1853] name=None, attn_weights_entropy = tensor([2.0047, 1.8146, 4.9271, 4.2765], device='cuda:0')
2023-11-16 14:23:29,751 INFO [decode.py:601] The transcripts are stored in zipformer/exp-w-tal-csasr/greedy_search/recogs-tal_csasr_dev-greedy_search-epoch-34-avg-19-context-2-max-sym-per-frame-1-use-averaged-model.txt
2023-11-16 14:23:29,950 INFO [utils.py:565] [tal_csasr_dev-greedy_search] %WER 6.65% [7575 / 113905, 1284 ins, 1773 del, 4518 sub ]
2023-11-16 14:23:30,343 INFO [decode.py:614] Wrote detailed error stats to zipformer/exp-w-tal-csasr/greedy_search/errs-tal_csasr_dev-greedy_search-epoch-34-avg-19-context-2-max-sym-per-frame-1-use-averaged-model.txt
2023-11-16 14:23:30,346 INFO [decode.py:630]
For tal_csasr_dev, WER of different settings are:
greedy_search 6.65 best for tal_csasr_dev
2023-11-16 14:23:30,346 INFO [decode.py:848] Done!
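
The two `%WER` summary lines above report a total error count followed by its insertion, deletion, and substitution breakdown. The sketch below re-derives the percentages from those counts as a sanity check. It is a minimal illustration, not icefall code: the names `WER_RE` and `parse_wer_line` are hypothetical, and the regular expression assumes the exact line layout seen in this log.

```python
import re

# The two summary lines, copied verbatim from the log above.
LOG_LINES = [
    "2023-11-16 14:21:53,571 INFO [utils.py:565] [tal_csasr_test-greedy_search] "
    "%WER 6.69% [22424 / 334989, 3750 ins, 5288 del, 13386 sub ]",
    "2023-11-16 14:23:29,950 INFO [utils.py:565] [tal_csasr_dev-greedy_search] "
    "%WER 6.65% [7575 / 113905, 1284 ins, 1773 del, 4518 sub ]",
]

# Hypothetical pattern: matches "[<name>] %WER <pct>% [<errs> / <ref>, ...]".
WER_RE = re.compile(
    r"\[(?P<name>[^\]]+)\] %WER (?P<wer>[\d.]+)% "
    r"\[(?P<errs>\d+) / (?P<ref>\d+), "
    r"(?P<ins>\d+) ins, (?P<dels>\d+) del, (?P<sub>\d+) sub \]"
)


def parse_wer_line(line):
    """Return the fields of one %WER line as a dict, or None if no match."""
    m = WER_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Keep the name and the reported percentage as strings; counts become ints.
    for key in ("errs", "ref", "ins", "dels", "sub"):
        fields[key] = int(fields[key])
    return fields


for line in LOG_LINES:
    r = parse_wer_line(line)
    # WER = (insertions + deletions + substitutions) / reference tokens.
    total = r["ins"] + r["dels"] + r["sub"]
    recomputed = f"{100.0 * total / r['ref']:.2f}"
    consistent = total == r["errs"] and recomputed == r["wer"]
    print(r["name"], "reported", r["wer"], "recomputed", recomputed, "ok" if consistent else "MISMATCH")
```

For both sets the breakdown sums to the reported error count (3750 + 5288 + 13386 = 22424 and 1284 + 1773 + 4518 = 7575), and dividing by the reference counts 334989 and 113905 reproduces the logged 6.69 and 6.65.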