[2022-12-19 18:22:21,564] [WARNING] [runner.py:179:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only. [2022-12-19 18:22:21,575] [INFO] [runner.py:508:main] cmd = /home/milan/hf_env/bin/python3 -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMF19 --master_addr=127.0.0.1 --master_port=29500 run_speech_recognition_seq2seq_streaming.py --deepspeed=ds_config.json --model_name_or_path=openai/whisper-medium --dataset_name=mozilla-foundation/common_voice_11_0 --dataset_config_name=sl --language=slovenian --train_split_name=train+validation --eval_split_name=test --model_index_name=Whisper Medium Slovenian CV11 --max_steps=5000 --output_dir=./ --per_device_train_batch_size=64 --per_device_eval_batch_size=32 --logging_steps=25 --learning_rate=1e-5 --warmup_steps=500 --evaluation_strategy=steps --eval_steps=1000 --save_strategy=steps --save_steps=1000 --generation_max_length=225 --length_column_name=input_length --max_duration_in_seconds=30 --text_column_name=sentence --freeze_feature_encoder=False --report_to=tensorboard --metric_for_best_model=wer --greater_is_better=False --load_best_model_at_end --gradient_checkpointing --fp16 --overwrite_output_dir --do_train --do_eval --predict_with_generate --do_normalize_eval --streaming=False --use_auth_token --push_to_hub [2022-12-19 18:22:23,159] [INFO] [launch.py:142:main] WORLD INFO DICT: {'localhost': [0]} [2022-12-19 18:22:23,159] [INFO] [launch.py:148:main] nnodes=1, num_local_procs=1, node_rank=0 [2022-12-19 18:22:23,159] [INFO] [launch.py:161:main] global_rank_mapping=defaultdict(, {'localhost': [0]}) [2022-12-19 18:22:23,159] [INFO] [launch.py:162:main] dist_world_size=1 [2022-12-19 18:22:23,159] [INFO] [launch.py:164:main] Setting CUDA_VISIBLE_DEVICES=0 [2022-12-19 18:22:27,335] [INFO] [comm.py:654:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl 12/19/2022 18:22:27 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 
1distributed training: True, 16-bits training: True 12/19/2022 18:22:27 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments( _n_gpu=1, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_pin_memory=True, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=ds_config.json, disable_tqdm=False, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_steps=1000, evaluation_strategy=steps, fp16=True, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, generation_max_length=225, generation_num_beams=None, gradient_accumulation_steps=1, gradient_checkpointing=True, greater_is_better=False, group_by_length=False, half_precision_backend=auto, hub_model_id=None, hub_private_repo=False, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_inputs_for_metrics=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=1e-05, length_column_name=input_length, load_best_model_at_end=True, local_rank=0, log_level=passive, log_level_replica=passive, log_on_each_node=True, logging_dir=./runs/Dec19_18-22-27_129-146-123-136, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=25, logging_strategy=steps, lr_scheduler_type=linear, max_grad_norm=1.0, max_steps=5000, metric_for_best_model=wer, mp_parameters=, no_cuda=False, num_train_epochs=3.0, optim=adamw_hf, optim_args=None, output_dir=./, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=32, per_device_train_batch_size=64, predict_with_generate=True, prediction_loss_only=False, push_to_hub=True, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, 
remove_unused_columns=True, report_to=['tensorboard'], resume_from_checkpoint=None, run_name=./, save_on_each_node=False, save_steps=1000, save_strategy=steps, save_total_limit=None, seed=42, sharded_ddp=[], skip_memory_metrics=True, sortish_sampler=False, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_ipex=False, use_legacy_prediction_loop=False, use_mps_device=False, warmup_ratio=0.0, warmup_steps=500, weight_decay=0.0, xpu_backend=None, ) 
12/19/2022 18:22:29 - INFO - datasets.info - Loading Dataset Infos from /home/milan/.cache/huggingface/modules/datasets_modules/datasets/mozilla-foundation--common_voice_11_0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f 12/19/2022 18:22:29 - INFO - datasets.builder - Overwrite dataset info from restored data version. 
12/19/2022 18:22:29 - INFO - datasets.info - Loading Dataset info from /home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f 12/19/2022 18:22:29 - WARNING - datasets.builder - Found cached dataset common_voice_11_0 (/home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f) 12/19/2022 18:22:29 - INFO - datasets.info - Loading Dataset info from /home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f 12/19/2022 18:22:31 - INFO - datasets.info - Loading Dataset Infos from /home/milan/.cache/huggingface/modules/datasets_modules/datasets/mozilla-foundation--common_voice_11_0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f 12/19/2022 18:22:31 - INFO - datasets.builder - Overwrite dataset info from restored data version. 
12/19/2022 18:22:31 - INFO - datasets.info - Loading Dataset info from /home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f 12/19/2022 18:22:31 - WARNING - datasets.builder - Found cached dataset common_voice_11_0 (/home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f) 12/19/2022 18:22:31 - INFO - datasets.info - Loading Dataset info from /home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f 12/19/2022 18:22:33 - INFO - datasets.info - Loading Dataset Infos from /home/milan/.cache/huggingface/modules/datasets_modules/datasets/mozilla-foundation--common_voice_11_0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f 12/19/2022 18:22:33 - INFO - datasets.builder - Overwrite dataset info from restored data version. 
12/19/2022 18:22:33 - INFO - datasets.info - Loading Dataset info from /home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f 12/19/2022 18:22:33 - WARNING - datasets.builder - Found cached dataset common_voice_11_0 (/home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f) 12/19/2022 18:22:33 - INFO - datasets.info - Loading Dataset info from /home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f 12/19/2022 18:22:44 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f/cache-41ec946a05a262bb.arrow 12/19/2022 18:22:45 - INFO - datasets.arrow_dataset - Caching processed dataset at /home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f/cache-3645a625c071a58a.arrow 12/19/2022 18:25:44 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /home/milan/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/sl/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f/cache-d896a0b0378699aa.arrow 12/19/2022 18:25:46 - WARNING - huggingface_hub.repository - /home/milan/whisper-medium-sl-cv11/./ is already a clone of https://huggingface.co/mikr/whisper-medium-sl-cv11. Make sure you pull the latest changes with `repo.git_pull()`. 
[2022-12-19 18:25:50,570] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.7.7, git-hash=unknown, git-branch=unknown [2022-12-19 18:25:51,215] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False [2022-12-19 18:25:52,405] [WARNING] [cpu_adam.py:83:__init__] FP16 params for CPUAdam may not work on AMD CPUs Installed CUDA version 11.6 does not match the version torch was compiled with 11.7 but since the APIs are compatible, accepting this combination ninja: no work to do. Time to load cpu_adam op: 2.9568140506744385 seconds Adam Optimizer #0 is created with AVX2 arithmetic capability. Config: alpha=0.000010, betas=(0.900000, 0.999000), weight_decay=0.000000, adam_w=1 [2022-12-19 18:25:57,177] [INFO] [logging.py:68:log_dist] [Rank 0] Using DeepSpeed Optimizer param name adamw as basic optimizer [2022-12-19 18:25:57,350] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Basic Optimizer = DeepSpeedCPUAdam [2022-12-19 18:25:57,350] [INFO] [utils.py:52:is_zero_supported_optimizer] Checking ZeRO support for optimizer=DeepSpeedCPUAdam type= [2022-12-19 18:25:57,350] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 2 optimizer [2022-12-19 18:25:57,350] [INFO] [stage_1_and_2.py:140:__init__] Reduce bucket size 200000000 [2022-12-19 18:25:57,350] [INFO] [stage_1_and_2.py:141:__init__] Allgather bucket size 200000000 [2022-12-19 18:25:57,350] [INFO] [stage_1_and_2.py:142:__init__] CPU Offload: True [2022-12-19 18:25:57,350] [INFO] [stage_1_and_2.py:143:__init__] Round robin gradient partitioning: False ninja: no work to do. 
Time to load utils op: 0.4852731227874756 seconds Rank: 0 partition count [1] and sizes[(763857920, False)] [2022-12-19 18:25:59,864] [INFO] [utils.py:827:see_memory_usage] Before initializing optimizer states [2022-12-19 18:25:59,865] [INFO] [utils.py:828:see_memory_usage] MA 1.52 GB Max_MA 1.52 GB CA 2.86 GB Max_CA 3 GB [2022-12-19 18:25:59,865] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 10.81 GB, percent = 5.5% [2022-12-19 18:26:01,836] [INFO] [utils.py:827:see_memory_usage] After initializing optimizer states [2022-12-19 18:26:01,837] [INFO] [utils.py:828:see_memory_usage] MA 1.52 GB Max_MA 1.52 GB CA 2.86 GB Max_CA 3 GB [2022-12-19 18:26:01,837] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 20.58 GB, percent = 10.5% [2022-12-19 18:26:01,837] [INFO] [stage_1_and_2.py:525:__init__] optimizer state initialized [2022-12-19 18:26:01,907] [INFO] [utils.py:827:see_memory_usage] After initializing ZeRO optimizer [2022-12-19 18:26:01,908] [INFO] [utils.py:828:see_memory_usage] MA 1.52 GB Max_MA 1.52 GB CA 2.86 GB Max_CA 3 GB [2022-12-19 18:26:01,908] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 20.58 GB, percent = 10.5% [2022-12-19 18:26:01,926] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = adamw [2022-12-19 18:26:01,926] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed using configured LR scheduler = WarmupDecayLR [2022-12-19 18:26:01,926] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler = [2022-12-19 18:26:01,926] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[1e-05], mom=[[0.9, 0.999]] [2022-12-19 18:26:01,927] [INFO] [config.py:1020:print] DeepSpeedEngine configuration: [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] activation_checkpointing_config { "partition_activations": false, "contiguous_memory_optimization": false, "cpu_checkpointing": false, "number_checkpoints": null, "synchronize_checkpoint_boundary": false, "profile": false } 
[2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True} [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] amp_enabled .................. False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] amp_params ................... False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] autotuning_config ............ { "enabled": false, "start_step": null, "end_step": null, "metric_path": null, "arg_mappings": null, "metric": "throughput", "model_info": null, "results_dir": "autotuning_results", "exps_dir": "autotuning_exps", "overwrite": true, "fast": true, "start_profile_step": 3, "end_profile_step": 5, "tuner_type": "gridsearch", "tuner_early_stopping": 5, "tuner_num_trials": 50, "model_info_path": null, "mp_size": 1, "max_train_batch_size": null, "min_train_batch_size": 1, "max_train_micro_batch_size_per_gpu": 1.024000e+03, "min_train_micro_batch_size_per_gpu": 1, "num_tuning_micro_batch_sizes": 3 } [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] bfloat16_enabled ............. False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] checkpoint_parallel_write_pipeline False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] checkpoint_tag_validation_enabled True [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] checkpoint_tag_validation_fail False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] comms_config ................. [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] communication_data_type ...... None [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] compression_config ........... 
{'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] curriculum_enabled ........... False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] curriculum_params ............ False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] dataloader_drop_last ......... False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] disable_allgather ............ False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] dump_state ................... False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] dynamic_loss_scale_args ...... {'init_scale': 65536, 'scale_window': 1000, 'delayed_shift': 2, 'min_scale': 1} [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] eigenvalue_enabled ........... False [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] eigenvalue_gas_boundary_resolution 1 [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] eigenvalue_layer_name ........ 
bert.encoder.layer [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] eigenvalue_layer_num ......... 0 [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] eigenvalue_max_iter .......... 100 [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] eigenvalue_stability ......... 1e-06 [2022-12-19 18:26:01,928] [INFO] [config.py:1024:print] eigenvalue_tol ............... 0.01 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] eigenvalue_verbose ........... False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] elasticity_enabled ........... False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] flops_profiler_config ........ { "enabled": false, "profile_step": 1, "module_depth": -1, "top_modules": 1, "detailed": true, "output_file": null } [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] fp16_auto_cast ............... False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] fp16_enabled ................. True [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] fp16_master_weights_and_gradients False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] global_rank .................. 0 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] grad_accum_dtype ............. None [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] gradient_accumulation_steps .. 1 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] gradient_clipping ............ 1.0 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] gradient_predivide_factor .... 1.0 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] initial_dynamic_scale ........ 65536 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] load_universal_checkpoint .... False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] loss_scale ................... 0 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] memory_breakdown ............. False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] monitor_config ............... 
[2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] nebula_config ................ { "enabled": false, "persistent_storage_path": null, "persistent_time_interval": 100, "num_of_version_in_retention": 2, "enable_nebula_load": true, "load_path": null } [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] optimizer_legacy_fusion ...... False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] optimizer_name ............... adamw [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] optimizer_params ............. {'lr': 1e-05, 'betas': [0.9, 0.999], 'eps': 1e-08, 'weight_decay': 0.0} [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0} [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] pld_enabled .................. False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] pld_params ................... False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] prescale_gradients ........... False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] scheduler_name ............... WarmupDecayLR [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] scheduler_params ............. {'last_batch_iteration': -1, 'total_num_steps': 5000, 'warmup_min_lr': 0, 'warmup_max_lr': 1e-05, 'warmup_num_steps': 500} [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] sparse_attention ............. None [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] sparse_gradients_enabled ..... False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] steps_per_print .............. 10 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] train_batch_size ............. 64 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] train_micro_batch_size_per_gpu 64 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] use_node_local_storage ....... 
False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] wall_clock_breakdown ......... False [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] world_size ................... 1 [2022-12-19 18:26:01,929] [INFO] [config.py:1024:print] zero_allow_untested_optimizer False [2022-12-19 18:26:01,930] [INFO] [config.py:1024:print] zero_config .................. stage=2 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=200000000 allgather_partitions=True allgather_bucket_size=200000000 overlap_comm=True load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=DeepSpeedZeroOffloadOptimizerConfig(device='cpu', nvme_path=None, buffer_count=4, pin_memory=True, pipeline=False, pipeline_read=False, pipeline_write=False, fast_init=False) sub_group_size=1,000,000,000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50,000,000 param_persistence_threshold=100,000 model_persistence_threshold=sys.maxsize max_live_parameters=1,000,000,000 max_reuse_distance=1,000,000,000 gather_16bit_weights_on_model_save=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False [2022-12-19 18:26:01,930] [INFO] [config.py:1024:print] zero_enabled ................. True [2022-12-19 18:26:01,930] [INFO] [config.py:1024:print] zero_optimization_stage ...... 
2 [2022-12-19 18:26:01,930] [INFO] [config.py:1009:print_user_config] json = { "fp16": { "enabled": true, "loss_scale": 0, "loss_scale_window": 1000, "initial_scale_power": 16, "hysteresis": 2, "min_loss_scale": 1 }, "optimizer": { "type": "AdamW", "params": { "lr": 1e-05, "betas": [0.9, 0.999], "eps": 1e-08, "weight_decay": 0.0 } }, "scheduler": { "type": "WarmupDecayLR", "params": { "last_batch_iteration": -1, "total_num_steps": 5.000000e+03, "warmup_min_lr": 0, "warmup_max_lr": 1e-05, "warmup_num_steps": 500 } }, "zero_optimization": { "stage": 2, "offload_optimizer": { "device": "cpu", "pin_memory": true }, "allgather_partitions": true, "allgather_bucket_size": 2.000000e+08, "overlap_comm": true, "reduce_scatter": true, "reduce_bucket_size": 2.000000e+08, "contiguous_gradients": true }, "gradient_accumulation_steps": 1, "gradient_clipping": 1.0, "train_batch_size": 64, "train_micro_batch_size_per_gpu": 64 } Time to load utils op: 0.00036025047302246094 seconds [2022-12-19 18:26:08,603] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 65536, reducing to 65536 [2022-12-19 18:26:14,791] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 65536, reducing to 32768.0 [2022-12-19 18:26:20,916] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 32768.0, reducing to 16384.0 [2022-12-19 18:26:20,917] [INFO] [timer.py:197:stop] 0/3, RunningAvgSamplesPerSec=12.770695296775497, CurrSamplesPerSec=12.770695296775497, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:26:27,184] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. 
Attempted loss scale: 16384.0, reducing to 8192.0 [2022-12-19 18:26:27,185] [INFO] [timer.py:197:stop] 0/4, RunningAvgSamplesPerSec=12.72161336521847, CurrSamplesPerSec=12.672907264842044, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:26:33,823] [INFO] [timer.py:197:stop] 0/5, RunningAvgSamplesPerSec=12.417447570891687, CurrSamplesPerSec=11.850759138646605, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:26:40,361] [INFO] [timer.py:197:stop] 0/6, RunningAvgSamplesPerSec=12.289279867666883, CurrSamplesPerSec=11.920174686536152, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:26:46,889] [INFO] [timer.py:197:stop] 0/7, RunningAvgSamplesPerSec=12.202032162762992, CurrSamplesPerSec=11.86508755136586, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:26:53,443] [INFO] [timer.py:197:stop] 0/8, RunningAvgSamplesPerSec=12.134885303524678, CurrSamplesPerSec=11.809939294195038, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:26:59,982] [INFO] [timer.py:197:stop] 0/9, RunningAvgSamplesPerSec=12.090958404070973, CurrSamplesPerSec=11.833933474922075, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:27:06,514] [INFO] [logging.py:68:log_dist] [Rank 0] step=10, skipped=4, lr=[2.883141528559073e-06], mom=[[0.9, 0.999]] [2022-12-19 18:27:06,515] [INFO] [timer.py:197:stop] 0/10, RunningAvgSamplesPerSec=12.063121856911412, CurrSamplesPerSec=11.871797978947413, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:27:13,066] [INFO] [timer.py:197:stop] 0/11, RunningAvgSamplesPerSec=12.042646092618936, CurrSamplesPerSec=11.881308832152559, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:27:19,501] [INFO] [timer.py:197:stop] 0/12, RunningAvgSamplesPerSec=12.03670392322164, CurrSamplesPerSec=11.983487114497631, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:27:25,966] [INFO] [timer.py:197:stop] 0/13, RunningAvgSamplesPerSec=12.030092689268644, CurrSamplesPerSec=11.964377606499204, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:27:32,539] [INFO] [timer.py:197:stop] 0/14, RunningAvgSamplesPerSec=12.016084789228547, CurrSamplesPerSec=11.864123695174017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:27:39,048] [INFO] [timer.py:197:stop] 0/15, RunningAvgSamplesPerSec=12.003122298261156, CurrSamplesPerSec=11.849725945066293, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:27:45,553] [INFO] [timer.py:197:stop] 0/16, RunningAvgSamplesPerSec=11.996271649287452, CurrSamplesPerSec=11.907919579281874, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:27:52,175] [INFO] [timer.py:197:stop] 0/17, RunningAvgSamplesPerSec=11.986143633379378, CurrSamplesPerSec=11.846126084826565, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:27:58,757] [INFO] [timer.py:197:stop] 0/18, RunningAvgSamplesPerSec=11.974989551776911, CurrSamplesPerSec=11.81013518046904, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:28:05,297] [INFO] [timer.py:197:stop] 0/19, RunningAvgSamplesPerSec=11.96641749179651, CurrSamplesPerSec=11.830914662772678, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:28:11,830] [INFO] [logging.py:68:log_dist] [Rank 0] step=20, skipped=4, lr=[4.461405575910259e-06], mom=[[0.9, 0.999]] [2022-12-19 18:28:11,830] [INFO] [timer.py:197:stop] 0/20, RunningAvgSamplesPerSec=11.96237746913552, CurrSamplesPerSec=11.89411207559048, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:28:18,307] [INFO] [timer.py:197:stop] 0/21, RunningAvgSamplesPerSec=11.960873700663855, CurrSamplesPerSec=11.933870372518308, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:28:24,815] [INFO] [timer.py:197:stop] 0/22, RunningAvgSamplesPerSec=11.959183627554138, CurrSamplesPerSec=11.927162742367901, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:28:31,421] [INFO] [timer.py:197:stop] 0/23, RunningAvgSamplesPerSec=11.95565665870325, CurrSamplesPerSec=11.885551588767477, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:28:37,888] [INFO] [timer.py:197:stop] 0/24, RunningAvgSamplesPerSec=11.951338000586484, CurrSamplesPerSec=11.861361473509685, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:28:44,383] [INFO] [timer.py:197:stop] 0/25, RunningAvgSamplesPerSec=11.951529935048148, CurrSamplesPerSec=11.95575405345182, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.8218, 'learning_rate': 4.898977360288234e-06, 'epoch': 0.66} [2022-12-19 18:28:50,941] [INFO] [timer.py:197:stop] 0/26, RunningAvgSamplesPerSec=11.94718818601409, CurrSamplesPerSec=11.848191396639134, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:28:57,748] [INFO] [timer.py:197:stop] 0/27, RunningAvgSamplesPerSec=11.944532057257671, CurrSamplesPerSec=11.881137396942313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:29:04,233] [INFO] [timer.py:197:stop] 0/28, RunningAvgSamplesPerSec=11.945743275920867, CurrSamplesPerSec=11.97610377966594, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:29:10,929] [INFO] [timer.py:197:stop] 0/29, RunningAvgSamplesPerSec=11.947151312098287, CurrSamplesPerSec=11.983877117733229, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:29:17,648] [INFO] [logging.py:68:log_dist] [Rank 0] step=30, skipped=4, lr=[5.242641991936178e-06], mom=[[0.9, 0.999]] [2022-12-19 18:29:17,649] [INFO] [timer.py:197:stop] 0/30, RunningAvgSamplesPerSec=11.92522066157641, CurrSamplesPerSec=11.3620900436752, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:29:24,228] [INFO] [timer.py:197:stop] 0/31, RunningAvgSamplesPerSec=11.923891439812145, CurrSamplesPerSec=11.886793161338273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:29:30,731] [INFO] [timer.py:197:stop] 0/32, RunningAvgSamplesPerSec=11.925303916588168, CurrSamplesPerSec=11.966411812194945, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:29:37,552] [INFO] [timer.py:197:stop] 0/33, 
RunningAvgSamplesPerSec=11.923892449639919, CurrSamplesPerSec=11.881703257095392, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:29:44,037] [INFO] [timer.py:197:stop] 0/34, RunningAvgSamplesPerSec=11.923702348930892, CurrSamplesPerSec=11.917812231965113, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:29:50,877] [INFO] [timer.py:197:stop] 0/35, RunningAvgSamplesPerSec=11.922450480041357, CurrSamplesPerSec=11.882529004768674, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:29:57,416] [INFO] [timer.py:197:stop] 0/36, RunningAvgSamplesPerSec=11.92235585714169, CurrSamplesPerSec=11.919234143828925, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:30:04,014] [INFO] [timer.py:197:stop] 0/37, RunningAvgSamplesPerSec=11.923196315496618, CurrSamplesPerSec=11.951842573527628, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:30:08,617] [INFO] [timer.py:197:stop] 0/38, RunningAvgSamplesPerSec=12.017485639102567, CurrSamplesPerSec=16.61668531876392, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:30:15,093] [INFO] [timer.py:197:stop] 0/39, RunningAvgSamplesPerSec=12.015775214061128, CurrSamplesPerSec=11.954522523549615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:30:21,534] [INFO] [logging.py:68:log_dist] [Rank 0] step=40, skipped=4, lr=[5.766283057118146e-06], mom=[[0.9, 0.999]] [2022-12-19 18:30:21,534] [INFO] [timer.py:197:stop] 0/40, RunningAvgSamplesPerSec=12.013080562944676, CurrSamplesPerSec=11.914221126787318, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:30:28,065] [INFO] [timer.py:197:stop] 0/41, RunningAvgSamplesPerSec=12.007720882024694, CurrSamplesPerSec=11.8075377475698, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:30:34,563] [INFO] [timer.py:197:stop] 0/42, RunningAvgSamplesPerSec=12.00465544250619, CurrSamplesPerSec=11.886312080397534, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:30:41,100] [INFO] [timer.py:197:stop] 0/43, 
RunningAvgSamplesPerSec=12.000052226980072, CurrSamplesPerSec=11.818774664140612, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:30:47,622] [INFO] [timer.py:197:stop] 0/44, RunningAvgSamplesPerSec=11.99744423321191, CurrSamplesPerSec=11.89148389838182, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:30:54,128] [INFO] [timer.py:197:stop] 0/45, RunningAvgSamplesPerSec=11.994739168724625, CurrSamplesPerSec=11.882217626561658, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:31:00,688] [INFO] [timer.py:197:stop] 0/46, RunningAvgSamplesPerSec=11.991993376082611, CurrSamplesPerSec=11.875101930347217, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:31:07,167] [INFO] [timer.py:197:stop] 0/47, RunningAvgSamplesPerSec=11.989124471747616, CurrSamplesPerSec=11.864237482882771, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:31:13,970] [INFO] [timer.py:197:stop] 0/48, RunningAvgSamplesPerSec=11.988300349175946, CurrSamplesPerSec=11.95133173632273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:31:20,448] [INFO] [timer.py:197:stop] 0/49, RunningAvgSamplesPerSec=11.98839091553399, CurrSamplesPerSec=11.9925584477353, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:31:27,193] [INFO] [logging.py:68:log_dist] [Rank 0] step=50, skipped=4, lr=[6.160712527409633e-06], mom=[[0.9, 0.999]] [2022-12-19 18:31:27,194] [INFO] [timer.py:197:stop] 0/50, RunningAvgSamplesPerSec=11.98648719937796, CurrSamplesPerSec=11.897689484904296, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.2781, 'learning_rate': 6.160712527409633e-06, 'epoch': 1.32} [2022-12-19 18:31:33,809] [INFO] [timer.py:197:stop] 0/51, RunningAvgSamplesPerSec=11.979564343681941, CurrSamplesPerSec=11.656417669142503, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:31:40,647] [INFO] [timer.py:197:stop] 0/52, RunningAvgSamplesPerSec=11.978034201609391, CurrSamplesPerSec=11.90353309973444, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 18:31:47,208] [INFO] [timer.py:197:stop] 0/53, RunningAvgSamplesPerSec=11.976828784462107, CurrSamplesPerSec=11.91686571359723, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:31:54,197] [INFO] [timer.py:197:stop] 0/54, RunningAvgSamplesPerSec=11.9748057793317, CurrSamplesPerSec=11.872530981266877, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:32:00,665] [INFO] [timer.py:197:stop] 0/55, RunningAvgSamplesPerSec=11.973632916007181, CurrSamplesPerSec=11.912959014937295, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:32:07,622] [INFO] [timer.py:197:stop] 0/56, RunningAvgSamplesPerSec=11.971082287878863, CurrSamplesPerSec=11.83743666385719, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:32:14,104] [INFO] [timer.py:197:stop] 0/57, RunningAvgSamplesPerSec=11.97083497117326, CurrSamplesPerSec=11.957495027203272, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:32:20,725] [INFO] [timer.py:197:stop] 0/58, RunningAvgSamplesPerSec=11.967168120211076, CurrSamplesPerSec=11.768893495278926, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:32:27,156] [INFO] [timer.py:197:stop] 0/59, RunningAvgSamplesPerSec=11.967203877717253, CurrSamplesPerSec=11.969206639160747, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:32:33,640] [INFO] [logging.py:68:log_dist] [Rank 0] step=60, skipped=4, lr=[6.4772414076394205e-06], mom=[[0.9, 0.999]] [2022-12-19 18:32:33,640] [INFO] [timer.py:197:stop] 0/60, RunningAvgSamplesPerSec=11.96614803045232, CurrSamplesPerSec=11.906271168097536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:32:40,160] [INFO] [timer.py:197:stop] 0/61, RunningAvgSamplesPerSec=11.96498915328232, CurrSamplesPerSec=11.898156193380135, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:32:46,598] [INFO] [timer.py:197:stop] 0/62, RunningAvgSamplesPerSec=11.964632111030582, CurrSamplesPerSec=11.943604268286796, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
18:32:53,065] [INFO] [timer.py:197:stop] 0/63, RunningAvgSamplesPerSec=11.96506026405694, CurrSamplesPerSec=11.990805642653113, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:32:59,647] [INFO] [timer.py:197:stop] 0/64, RunningAvgSamplesPerSec=11.962476461050668, CurrSamplesPerSec=11.806947250765349, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:33:06,113] [INFO] [timer.py:197:stop] 0/65, RunningAvgSamplesPerSec=11.962677733759069, CurrSamplesPerSec=11.975169883088887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:33:12,602] [INFO] [timer.py:197:stop] 0/66, RunningAvgSamplesPerSec=11.961295706682522, CurrSamplesPerSec=11.874867111068959, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:33:19,121] [INFO] [timer.py:197:stop] 0/67, RunningAvgSamplesPerSec=11.960030019809254, CurrSamplesPerSec=11.879579456476584, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:33:25,847] [INFO] [timer.py:197:stop] 0/68, RunningAvgSamplesPerSec=11.951028229726973, CurrSamplesPerSec=11.393622052761886, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:33:32,357] [INFO] [timer.py:197:stop] 0/69, RunningAvgSamplesPerSec=11.951568501396249, CurrSamplesPerSec=11.987334758286156, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:33:38,865] [INFO] [logging.py:68:log_dist] [Rank 0] step=70, skipped=4, lr=[6.741623406776245e-06], mom=[[0.9, 0.999]] [2022-12-19 18:33:38,866] [INFO] [timer.py:197:stop] 0/70, RunningAvgSamplesPerSec=11.950573636747201, CurrSamplesPerSec=11.884292912668403, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:33:45,329] [INFO] [timer.py:197:stop] 0/71, RunningAvgSamplesPerSec=11.949826825011192, CurrSamplesPerSec=11.899261673865789, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:33:51,947] [INFO] [timer.py:197:stop] 0/72, RunningAvgSamplesPerSec=11.945692974330852, CurrSamplesPerSec=11.667203342186168, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
18:33:58,411] [INFO] [timer.py:197:stop] 0/73, RunningAvgSamplesPerSec=11.945205371204342, CurrSamplesPerSec=11.911171789125206, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:34:04,946] [INFO] [timer.py:197:stop] 0/74, RunningAvgSamplesPerSec=11.944205406112932, CurrSamplesPerSec=11.873633280494387, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:34:11,483] [INFO] [timer.py:197:stop] 0/75, RunningAvgSamplesPerSec=11.94337558373894, CurrSamplesPerSec=11.883929882564297, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.1633, 'learning_rate': 6.85912902234906e-06, 'epoch': 1.97} [2022-12-19 18:34:16,140] [INFO] [timer.py:197:stop] 0/76, RunningAvgSamplesPerSec=11.989055368710677, CurrSamplesPerSec=16.63305007801444, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:34:22,797] [INFO] [timer.py:197:stop] 0/77, RunningAvgSamplesPerSec=11.988778641946762, CurrSamplesPerSec=11.96833625049005, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:34:29,288] [INFO] [timer.py:197:stop] 0/78, RunningAvgSamplesPerSec=11.98765603744316, CurrSamplesPerSec=11.904055695817226, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:34:35,928] [INFO] [timer.py:197:stop] 0/79, RunningAvgSamplesPerSec=11.985019760105475, CurrSamplesPerSec=11.787999662391314, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:34:42,411] [INFO] [logging.py:68:log_dist] [Rank 0] step=80, skipped=4, lr=[6.968634661590082e-06], mom=[[0.9, 0.999]] [2022-12-19 18:34:42,412] [INFO] [timer.py:197:stop] 0/80, RunningAvgSamplesPerSec=11.98450592126031, CurrSamplesPerSec=11.94507220719026, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:34:49,194] [INFO] [timer.py:197:stop] 0/81, RunningAvgSamplesPerSec=11.984615261788422, CurrSamplesPerSec=11.99314997436825, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:34:55,719] [INFO] [timer.py:197:stop] 0/82, RunningAvgSamplesPerSec=11.983718552141424, 
CurrSamplesPerSec=11.913300028430148, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:35:02,203] [INFO] [timer.py:197:stop] 0/83, RunningAvgSamplesPerSec=11.982591198535706, CurrSamplesPerSec=11.893085009217916, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:35:08,689] [INFO] [timer.py:197:stop] 0/84, RunningAvgSamplesPerSec=11.981385521650415, CurrSamplesPerSec=11.884524947874823, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:35:15,210] [INFO] [timer.py:197:stop] 0/85, RunningAvgSamplesPerSec=11.980351942273366, CurrSamplesPerSec=11.896201008606102, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:35:21,707] [INFO] [timer.py:197:stop] 0/86, RunningAvgSamplesPerSec=11.980288340600778, CurrSamplesPerSec=11.975011754839112, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:35:28,095] [INFO] [timer.py:197:stop] 0/87, RunningAvgSamplesPerSec=11.980492749372829, CurrSamplesPerSec=11.997688023723322, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:35:34,860] [INFO] [timer.py:197:stop] 0/88, RunningAvgSamplesPerSec=11.973205093163857, CurrSamplesPerSec=11.384566644801064, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:35:41,317] [INFO] [timer.py:197:stop] 0/89, RunningAvgSamplesPerSec=11.97254430523956, CurrSamplesPerSec=11.915988109556679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:35:47,771] [INFO] [logging.py:68:log_dist] [Rank 0] step=90, skipped=4, lr=[7.1675433522258775e-06], mom=[[0.9, 0.999]] [2022-12-19 18:35:47,772] [INFO] [timer.py:197:stop] 0/90, RunningAvgSamplesPerSec=11.972879471110938, CurrSamplesPerSec=12.002110912130712, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:35:54,183] [INFO] [timer.py:197:stop] 0/91, RunningAvgSamplesPerSec=11.972986631129855, CurrSamplesPerSec=11.982424230439614, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:36:00,996] [INFO] [timer.py:197:stop] 0/92, RunningAvgSamplesPerSec=11.964624476181502, 
CurrSamplesPerSec=11.264435677007024, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:36:07,463] [INFO] [timer.py:197:stop] 0/93, RunningAvgSamplesPerSec=11.964893034868153, CurrSamplesPerSec=11.98911278661595, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:36:13,935] [INFO] [timer.py:197:stop] 0/94, RunningAvgSamplesPerSec=11.965195569846097, CurrSamplesPerSec=11.992790443526756, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:36:20,419] [INFO] [timer.py:197:stop] 0/95, RunningAvgSamplesPerSec=11.964468078174255, CurrSamplesPerSec=11.897915187934306, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:36:27,204] [INFO] [timer.py:197:stop] 0/96, RunningAvgSamplesPerSec=11.9646461154338, CurrSamplesPerSec=11.981226772701694, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:36:33,737] [INFO] [timer.py:197:stop] 0/97, RunningAvgSamplesPerSec=11.963508419495465, CurrSamplesPerSec=11.857522502684597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:36:40,244] [INFO] [timer.py:197:stop] 0/98, RunningAvgSamplesPerSec=11.962280878172274, CurrSamplesPerSec=11.846802068379677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:36:46,717] [INFO] [timer.py:197:stop] 0/99, RunningAvgSamplesPerSec=11.961254613684387, CurrSamplesPerSec=11.863546400382022, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:36:53,169] [INFO] [logging.py:68:log_dist] [Rank 0] step=100, skipped=4, lr=[7.344547104469332e-06], mom=[[0.9, 0.999]] [2022-12-19 18:36:53,170] [INFO] [timer.py:197:stop] 0/100, RunningAvgSamplesPerSec=11.960833391854557, CurrSamplesPerSec=11.920115402027617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0649, 'learning_rate': 7.344547104469332e-06, 'epoch': 2.63} [2022-12-19 18:36:59,620] [INFO] [timer.py:197:stop] 0/101, RunningAvgSamplesPerSec=11.96118042101715, CurrSamplesPerSec=11.995287243374307, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:37:06,134] [INFO] 
[timer.py:197:stop] 0/102, RunningAvgSamplesPerSec=11.95990428660375, CurrSamplesPerSec=11.834900780286048, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:37:12,761] [INFO] [timer.py:197:stop] 0/103, RunningAvgSamplesPerSec=11.955906047923378, CurrSamplesPerSec=11.569145389261527, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:37:19,274] [INFO] [timer.py:197:stop] 0/104, RunningAvgSamplesPerSec=11.955430896240086, CurrSamplesPerSec=11.907634336105545, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:37:25,829] [INFO] [timer.py:197:stop] 0/105, RunningAvgSamplesPerSec=11.954819456194633, CurrSamplesPerSec=11.892779400315366, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:37:32,289] [INFO] [timer.py:197:stop] 0/106, RunningAvgSamplesPerSec=11.955050082573079, CurrSamplesPerSec=11.978852353504577, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:37:39,064] [INFO] [timer.py:197:stop] 0/107, RunningAvgSamplesPerSec=11.949117331314653, CurrSamplesPerSec=11.362683535059936, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:37:45,550] [INFO] [timer.py:197:stop] 0/108, RunningAvgSamplesPerSec=11.949403097124424, CurrSamplesPerSec=11.979484762760327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:37:52,041] [INFO] [timer.py:197:stop] 0/109, RunningAvgSamplesPerSec=11.948781088218709, CurrSamplesPerSec=11.883213357800784, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:37:58,563] [INFO] [logging.py:68:log_dist] [Rank 0] step=110, skipped=4, lr=[7.503995457567235e-06], mom=[[0.9, 0.999]] [2022-12-19 18:37:58,564] [INFO] [timer.py:197:stop] 0/110, RunningAvgSamplesPerSec=11.948342450463205, CurrSamplesPerSec=11.901593560971051, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:38:05,256] [INFO] [timer.py:197:stop] 0/111, RunningAvgSamplesPerSec=11.947155357237115, CurrSamplesPerSec=11.820322941938132, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:38:11,847] 
[INFO] [timer.py:197:stop] 0/112, RunningAvgSamplesPerSec=11.94614608161501, CurrSamplesPerSec=11.837148001603007, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:38:18,436] [INFO] [timer.py:197:stop] 0/113, RunningAvgSamplesPerSec=11.945807279393875, CurrSamplesPerSec=11.908655992436962, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:38:23,107] [INFO] [timer.py:197:stop] 0/114, RunningAvgSamplesPerSec=11.975399622944872, CurrSamplesPerSec=16.517131060398814, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:38:29,929] [INFO] [timer.py:197:stop] 0/115, RunningAvgSamplesPerSec=11.97564932824106, CurrSamplesPerSec=12.00368237210802, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:38:36,367] [INFO] [timer.py:197:stop] 0/116, RunningAvgSamplesPerSec=11.975750223871604, CurrSamplesPerSec=11.9871623909319, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:38:42,845] [INFO] [timer.py:197:stop] 0/117, RunningAvgSamplesPerSec=11.975163876554683, CurrSamplesPerSec=11.908694558906802, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:38:49,381] [INFO] [timer.py:197:stop] 0/118, RunningAvgSamplesPerSec=11.974870190861088, CurrSamplesPerSec=11.941192147416698, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:38:55,854] [INFO] [timer.py:197:stop] 0/119, RunningAvgSamplesPerSec=11.974060078277441, CurrSamplesPerSec=11.880825039984053, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:39:02,349] [INFO] [logging.py:68:log_dist] [Rank 0] step=120, skipped=4, lr=[7.649058662787184e-06], mom=[[0.9, 0.999]] [2022-12-19 18:39:02,349] [INFO] [timer.py:197:stop] 0/120, RunningAvgSamplesPerSec=11.974403931668107, CurrSamplesPerSec=12.014771562178144, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:39:08,849] [INFO] [timer.py:197:stop] 0/121, RunningAvgSamplesPerSec=11.973817366938638, CurrSamplesPerSec=11.905003876154641, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:39:15,351] 
[INFO] [timer.py:197:stop] 0/122, RunningAvgSamplesPerSec=11.972927883840688, CurrSamplesPerSec=11.868014690343205, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:39:21,901] [INFO] [timer.py:197:stop] 0/123, RunningAvgSamplesPerSec=11.971012656961737, CurrSamplesPerSec=11.745550079009458, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:39:28,332] [INFO] [timer.py:197:stop] 0/124, RunningAvgSamplesPerSec=11.970529099538052, CurrSamplesPerSec=11.912305592465955, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:39:34,765] [INFO] [timer.py:197:stop] 0/125, RunningAvgSamplesPerSec=11.970787058332547, CurrSamplesPerSec=12.002341667703618, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0433, 'learning_rate': 7.716963756434345e-06, 'epoch': 3.29} [2022-12-19 18:39:41,238] [INFO] [timer.py:197:stop] 0/126, RunningAvgSamplesPerSec=11.970920647681927, CurrSamplesPerSec=11.9873749066736, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:39:47,734] [INFO] [timer.py:197:stop] 0/127, RunningAvgSamplesPerSec=11.9705270240146, CurrSamplesPerSec=11.921917491064518, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:39:54,225] [INFO] [timer.py:197:stop] 0/128, RunningAvgSamplesPerSec=11.969771430750296, CurrSamplesPerSec=11.876067571207951, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:40:00,694] [INFO] [timer.py:197:stop] 0/129, RunningAvgSamplesPerSec=11.969909002492257, CurrSamplesPerSec=11.987268380208794, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:40:07,286] [INFO] [logging.py:68:log_dist] [Rank 0] step=130, skipped=4, lr=[7.782118888847307e-06], mom=[[0.9, 0.999]] [2022-12-19 18:40:07,287] [INFO] [timer.py:197:stop] 0/130, RunningAvgSamplesPerSec=11.967800546277202, CurrSamplesPerSec=11.705931930100745, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:40:13,791] [INFO] [timer.py:197:stop] 0/131, RunningAvgSamplesPerSec=11.967365051497387, 
CurrSamplesPerSec=11.911882174918238, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:40:20,269] [INFO] [timer.py:197:stop] 0/132, RunningAvgSamplesPerSec=11.966801221834064, CurrSamplesPerSec=11.894509986324449, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:40:26,733] [INFO] [timer.py:197:stop] 0/133, RunningAvgSamplesPerSec=11.966475206732566, CurrSamplesPerSec=11.924243965525896, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:40:33,575] [INFO] [timer.py:197:stop] 0/134, RunningAvgSamplesPerSec=11.966063229480717, CurrSamplesPerSec=11.912338367613689, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:40:40,046] [INFO] [timer.py:197:stop] 0/135, RunningAvgSamplesPerSec=11.965953999258348, CurrSamplesPerSec=11.951553093749999, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:40:46,623] [INFO] [timer.py:197:stop] 0/136, RunningAvgSamplesPerSec=11.96463166729028, CurrSamplesPerSec=11.791328087793204, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:40:53,187] [INFO] [timer.py:197:stop] 0/137, RunningAvgSamplesPerSec=11.963321308455823, CurrSamplesPerSec=11.790291766346405, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:40:59,627] [INFO] [timer.py:197:stop] 0/138, RunningAvgSamplesPerSec=11.963099130714472, CurrSamplesPerSec=11.933180703039122, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:41:06,159] [INFO] [timer.py:197:stop] 0/139, RunningAvgSamplesPerSec=11.962131778309548, CurrSamplesPerSec=11.832013419715604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:41:12,719] [INFO] [logging.py:68:log_dist] [Rank 0] step=140, skipped=4, lr=[7.905011559752758e-06], mom=[[0.9, 0.999]] [2022-12-19 18:41:12,720] [INFO] [timer.py:197:stop] 0/140, RunningAvgSamplesPerSec=11.960726369629006, CurrSamplesPerSec=11.771257666422239, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:41:19,671] [INFO] [timer.py:197:stop] 0/141, 
RunningAvgSamplesPerSec=11.960250749029747, CurrSamplesPerSec=11.894975918292486, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:41:26,762] [INFO] [timer.py:197:stop] 0/142, RunningAvgSamplesPerSec=11.959594585538905, CurrSamplesPerSec=11.869083088525297, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:41:33,845] [INFO] [timer.py:197:stop] 0/143, RunningAvgSamplesPerSec=11.95858879396008, CurrSamplesPerSec=11.8194282741326, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:41:40,940] [INFO] [timer.py:197:stop] 0/144, RunningAvgSamplesPerSec=11.958371856842502, CurrSamplesPerSec=11.927862316617233, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:41:47,653] [INFO] [timer.py:197:stop] 0/145, RunningAvgSamplesPerSec=11.958427022037231, CurrSamplesPerSec=11.966265650601596, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:41:54,085] [INFO] [timer.py:197:stop] 0/146, RunningAvgSamplesPerSec=11.958374727281898, CurrSamplesPerSec=11.950901283456991, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:42:00,522] [INFO] [timer.py:197:stop] 0/147, RunningAvgSamplesPerSec=11.95828940184098, CurrSamplesPerSec=11.946015237346073, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:42:07,030] [INFO] [timer.py:197:stop] 0/148, RunningAvgSamplesPerSec=11.958220538194965, CurrSamplesPerSec=11.948243697733337, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:42:13,549] [INFO] [timer.py:197:stop] 0/149, RunningAvgSamplesPerSec=11.957292030051242, CurrSamplesPerSec=11.823259798906044, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:42:20,067] [INFO] [logging.py:68:log_dist] [Rank 0] step=150, skipped=4, lr=[8.019180844200955e-06], mom=[[0.9, 0.999]] [2022-12-19 18:42:20,067] [INFO] [timer.py:197:stop] 0/150, RunningAvgSamplesPerSec=11.956663235993481, CurrSamplesPerSec=11.86494437892723, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0237, 'learning_rate': 8.019180844200955e-06, 
'epoch': 3.95} [2022-12-19 18:42:26,647] [INFO] [timer.py:197:stop] 0/151, RunningAvgSamplesPerSec=11.955515629909023, CurrSamplesPerSec=11.788064887272855, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:42:31,290] [INFO] [timer.py:197:stop] 0/152, RunningAvgSamplesPerSec=11.977348129310863, CurrSamplesPerSec=16.45456057843749, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:42:37,747] [INFO] [timer.py:197:stop] 0/153, RunningAvgSamplesPerSec=11.976952415552413, CurrSamplesPerSec=11.91789001293746, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:42:44,534] [INFO] [timer.py:197:stop] 0/154, RunningAvgSamplesPerSec=11.976760757249068, CurrSamplesPerSec=11.94789057676298, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:42:51,045] [INFO] [timer.py:197:stop] 0/155, RunningAvgSamplesPerSec=11.97646824430772, CurrSamplesPerSec=11.932171807157655, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:42:57,909] [INFO] [timer.py:197:stop] 0/156, RunningAvgSamplesPerSec=11.97604480143561, CurrSamplesPerSec=11.91160889892653, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:43:04,737] [INFO] [timer.py:197:stop] 0/157, RunningAvgSamplesPerSec=11.970746697659047, CurrSamplesPerSec=11.207217630449145, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:43:11,615] [INFO] [timer.py:197:stop] 0/158, RunningAvgSamplesPerSec=11.970268407906728, CurrSamplesPerSec=11.896592732283754, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:43:18,066] [INFO] [timer.py:197:stop] 0/159, RunningAvgSamplesPerSec=11.97016882755755, CurrSamplesPerSec=11.954654556116928, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:43:24,961] [INFO] [logging.py:68:log_dist] [Rank 0] step=160, skipped=4, lr=[8.125783520495252e-06], mom=[[0.9, 0.999]] [2022-12-19 18:43:24,962] [INFO] [timer.py:197:stop] 0/160, RunningAvgSamplesPerSec=11.969499477827759, CurrSamplesPerSec=11.86533194893277, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 18:43:31,501] [INFO] [timer.py:197:stop] 0/161, RunningAvgSamplesPerSec=11.969072512738691, CurrSamplesPerSec=11.901992500163498, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:43:38,664] [INFO] [timer.py:197:stop] 0/162, RunningAvgSamplesPerSec=11.968689703878965, CurrSamplesPerSec=11.908132992372748, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:43:45,179] [INFO] [timer.py:197:stop] 0/163, RunningAvgSamplesPerSec=11.96822662187064, CurrSamplesPerSec=11.894592206960546, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:43:52,001] [INFO] [timer.py:197:stop] 0/164, RunningAvgSamplesPerSec=11.968409771172443, CurrSamplesPerSec=11.997970090025632, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:43:58,690] [INFO] [timer.py:197:stop] 0/165, RunningAvgSamplesPerSec=11.96621957944813, CurrSamplesPerSec=11.62168732577348, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:44:05,609] [INFO] [timer.py:197:stop] 0/166, RunningAvgSamplesPerSec=11.966306152383915, CurrSamplesPerSec=11.980434303897045, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:44:12,120] [INFO] [timer.py:197:stop] 0/167, RunningAvgSamplesPerSec=11.96568829387472, CurrSamplesPerSec=11.865215517934883, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:44:18,982] [INFO] [timer.py:197:stop] 0/168, RunningAvgSamplesPerSec=11.96475516122947, CurrSamplesPerSec=11.812756109103413, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:44:25,547] [INFO] [timer.py:197:stop] 0/169, RunningAvgSamplesPerSec=11.964024371006682, CurrSamplesPerSec=11.843938162865731, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:44:32,589] [INFO] [logging.py:68:log_dist] [Rank 0] step=170, skipped=4, lr=[8.225760510392298e-06], mom=[[0.9, 0.999]] [2022-12-19 18:44:32,590] [INFO] [timer.py:197:stop] 0/170, RunningAvgSamplesPerSec=11.963545874564042, CurrSamplesPerSec=11.8841703217617, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 18:44:39,154] [INFO] [timer.py:197:stop] 0/171, RunningAvgSamplesPerSec=11.96285847388959, CurrSamplesPerSec=11.848485827569215, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:44:45,970] [INFO] [timer.py:197:stop] 0/172, RunningAvgSamplesPerSec=11.96268001448524, CurrSamplesPerSec=11.932596668382528, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:44:52,532] [INFO] [timer.py:197:stop] 0/173, RunningAvgSamplesPerSec=11.962371310623036, CurrSamplesPerSec=11.910122222355328, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:44:59,112] [INFO] [timer.py:197:stop] 0/174, RunningAvgSamplesPerSec=11.961408098069231, CurrSamplesPerSec=11.79894890701397, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:45:05,661] [INFO] [timer.py:197:stop] 0/175, RunningAvgSamplesPerSec=11.960593062480887, CurrSamplesPerSec=11.822040313293776, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0133, 'learning_rate': 8.27351214279797e-06, 'epoch': 4.61} [2022-12-19 18:45:12,212] [INFO] [timer.py:197:stop] 0/176, RunningAvgSamplesPerSec=11.959797940843089, CurrSamplesPerSec=11.823814951256551, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:45:18,710] [INFO] [timer.py:197:stop] 0/177, RunningAvgSamplesPerSec=11.959499301634326, CurrSamplesPerSec=11.907762165616957, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:45:25,365] [INFO] [timer.py:197:stop] 0/178, RunningAvgSamplesPerSec=11.95747328248155, CurrSamplesPerSec=11.613186770970227, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:45:31,875] [INFO] [timer.py:197:stop] 0/179, RunningAvgSamplesPerSec=11.957232795454122, CurrSamplesPerSec=11.91505721823269, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:45:38,364] [INFO] [logging.py:68:log_dist] [Rank 0] step=180, skipped=4, lr=[8.31988745412743e-06], mom=[[0.9, 0.999]] [2022-12-19 18:45:38,365] [INFO] [timer.py:197:stop] 0/180, 
RunningAvgSamplesPerSec=11.957260905931054, CurrSamplesPerSec=11.962238543302307, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:45:44,854] [INFO] [timer.py:197:stop] 0/181, RunningAvgSamplesPerSec=11.956980731274534, CurrSamplesPerSec=11.907317943457292, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:45:51,446] [INFO] [timer.py:197:stop] 0/182, RunningAvgSamplesPerSec=11.955415137054269, CurrSamplesPerSec=11.681627357786487, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:45:57,906] [INFO] [timer.py:197:stop] 0/183, RunningAvgSamplesPerSec=11.95545821814125, CurrSamplesPerSec=11.963217874858184, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:46:04,350] [INFO] [timer.py:197:stop] 0/184, RunningAvgSamplesPerSec=11.955471935263988, CurrSamplesPerSec=11.957955253040835, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:46:10,891] [INFO] [timer.py:197:stop] 0/185, RunningAvgSamplesPerSec=11.955210228044418, CurrSamplesPerSec=11.907769560796638, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:46:17,428] [INFO] [timer.py:197:stop] 0/186, RunningAvgSamplesPerSec=11.95481220795668, CurrSamplesPerSec=11.882418021813043, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:46:23,951] [INFO] [timer.py:197:stop] 0/187, RunningAvgSamplesPerSec=11.954920753920847, CurrSamplesPerSec=11.974926816050653, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:46:30,425] [INFO] [timer.py:197:stop] 0/188, RunningAvgSamplesPerSec=11.9545948193769, CurrSamplesPerSec=11.894601166970343, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:46:36,987] [INFO] [timer.py:197:stop] 0/189, RunningAvgSamplesPerSec=11.954255796824288, CurrSamplesPerSec=11.89153025551718, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:46:41,618] [INFO] [logging.py:68:log_dist] [Rank 0] step=190, skipped=4, lr=[8.408811289387583e-06], mom=[[0.9, 0.999]] [2022-12-19 18:46:41,618] [INFO] [timer.py:197:stop] 0/190, 
RunningAvgSamplesPerSec=11.972085726575438, CurrSamplesPerSec=16.602830247845535, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:46:48,709] [INFO] [timer.py:197:stop] 0/191, RunningAvgSamplesPerSec=11.971718365805724, CurrSamplesPerSec=11.90305277410919, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:46:55,095] [INFO] [timer.py:197:stop] 0/192, RunningAvgSamplesPerSec=11.971805924957366, CurrSamplesPerSec=11.988377632963648, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:47:02,024] [INFO] [timer.py:197:stop] 0/193, RunningAvgSamplesPerSec=11.971837716369658, CurrSamplesPerSec=11.97788114995674, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:47:08,479] [INFO] [timer.py:197:stop] 0/194, RunningAvgSamplesPerSec=11.97144893606086, CurrSamplesPerSec=11.897652044341887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:47:15,165] [INFO] [timer.py:197:stop] 0/195, RunningAvgSamplesPerSec=11.971633060620068, CurrSamplesPerSec=12.007090225390371, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:47:21,585] [INFO] [timer.py:197:stop] 0/196, RunningAvgSamplesPerSec=11.971344937010267, CurrSamplesPerSec=11.91599551495715, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:47:28,301] [INFO] [timer.py:197:stop] 0/197, RunningAvgSamplesPerSec=11.971002632316775, CurrSamplesPerSec=11.90496374964986, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:47:34,888] [INFO] [timer.py:197:stop] 0/198, RunningAvgSamplesPerSec=11.970098213872838, CurrSamplesPerSec=11.796310255242567, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:47:41,810] [INFO] [timer.py:197:stop] 0/199, RunningAvgSamplesPerSec=11.969640373249437, CurrSamplesPerSec=11.880574745588348, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:47:48,256] [INFO] [logging.py:68:log_dist] [Rank 0] step=200, skipped=4, lr=[8.49307723936858e-06], mom=[[0.9, 0.999]] [2022-12-19 18:47:48,257] [INFO] [timer.py:197:stop] 0/200, 
RunningAvgSamplesPerSec=11.969722973591413, CurrSamplesPerSec=11.986017505043487, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0104, 'learning_rate': 8.49307723936858e-06, 'epoch': 5.26} [2022-12-19 18:47:55,106] [INFO] [timer.py:197:stop] 0/201, RunningAvgSamplesPerSec=11.969823417978477, CurrSamplesPerSec=11.989744673162933, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:48:01,561] [INFO] [timer.py:197:stop] 0/202, RunningAvgSamplesPerSec=11.969927263071423, CurrSamplesPerSec=11.990628355028486, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:48:08,306] [INFO] [timer.py:197:stop] 0/203, RunningAvgSamplesPerSec=11.96891234037618, CurrSamplesPerSec=11.769329513044736, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:48:14,865] [INFO] [timer.py:197:stop] 0/204, RunningAvgSamplesPerSec=11.969023012232435, CurrSamplesPerSec=11.991309682333466, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:48:21,407] [INFO] [timer.py:197:stop] 0/205, RunningAvgSamplesPerSec=11.969053823741456, CurrSamplesPerSec=11.97528100273575, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:48:27,912] [INFO] [timer.py:197:stop] 0/206, RunningAvgSamplesPerSec=11.968591674220969, CurrSamplesPerSec=11.875508552134493, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:48:34,506] [INFO] [timer.py:197:stop] 0/207, RunningAvgSamplesPerSec=11.967412798901279, CurrSamplesPerSec=11.731682560399587, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:48:41,043] [INFO] [timer.py:197:stop] 0/208, RunningAvgSamplesPerSec=11.966704638128206, CurrSamplesPerSec=11.82328010843916, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:48:47,586] [INFO] [timer.py:197:stop] 0/209, RunningAvgSamplesPerSec=11.966013544546255, CurrSamplesPerSec=11.825330167934954, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:48:54,055] [INFO] [logging.py:68:log_dist] [Rank 0] step=210, skipped=4, lr=[8.573149077803088e-06], 
mom=[[0.9, 0.999]] [2022-12-19 18:48:54,056] [INFO] [timer.py:197:stop] 0/210, RunningAvgSamplesPerSec=11.965662492259098, CurrSamplesPerSec=11.893435424960474, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:49:00,650] [INFO] [timer.py:197:stop] 0/211, RunningAvgSamplesPerSec=11.963980559182376, CurrSamplesPerSec=11.6241240946051, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:49:07,097] [INFO] [timer.py:197:stop] 0/212, RunningAvgSamplesPerSec=11.963972501348298, CurrSamplesPerSec=11.962288652184109, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:49:13,536] [INFO] [timer.py:197:stop] 0/213, RunningAvgSamplesPerSec=11.963948947736387, CurrSamplesPerSec=11.959004743052576, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:49:20,077] [INFO] [timer.py:197:stop] 0/214, RunningAvgSamplesPerSec=11.963615780081513, CurrSamplesPerSec=11.893730000935333, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:49:26,934] [INFO] [timer.py:197:stop] 0/215, RunningAvgSamplesPerSec=11.95974524944009, CurrSamplesPerSec=11.19210838767328, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:49:33,451] [INFO] [timer.py:197:stop] 0/216, RunningAvgSamplesPerSec=11.959438531125091, CurrSamplesPerSec=11.894464133001891, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:49:39,989] [INFO] [timer.py:197:stop] 0/217, RunningAvgSamplesPerSec=11.959055902810736, CurrSamplesPerSec=11.877732856221867, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:49:46,523] [INFO] [timer.py:197:stop] 0/218, RunningAvgSamplesPerSec=11.958804526134285, CurrSamplesPerSec=11.905002820190525, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:49:53,081] [INFO] [timer.py:197:stop] 0/219, RunningAvgSamplesPerSec=11.958266565260303, CurrSamplesPerSec=11.843190398788378, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:49:59,604] [INFO] [logging.py:68:log_dist] [Rank 0] step=220, skipped=4, lr=[8.64942458567722e-06], 
mom=[[0.9, 0.999]] [2022-12-19 18:49:59,605] [INFO] [timer.py:197:stop] 0/220, RunningAvgSamplesPerSec=11.957966784064723, CurrSamplesPerSec=11.893267855157005, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:50:06,093] [INFO] [timer.py:197:stop] 0/221, RunningAvgSamplesPerSec=11.957712018435108, CurrSamplesPerSec=11.902431047514135, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:50:12,612] [INFO] [timer.py:197:stop] 0/222, RunningAvgSamplesPerSec=11.9578483384126, CurrSamplesPerSec=11.987777476038888, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:50:19,318] [INFO] [timer.py:197:stop] 0/223, RunningAvgSamplesPerSec=11.955153298789504, CurrSamplesPerSec=11.390381422993498, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:50:25,838] [INFO] [timer.py:197:stop] 0/224, RunningAvgSamplesPerSec=11.954664165619358, CurrSamplesPerSec=11.84753878486171, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:50:32,264] [INFO] [timer.py:197:stop] 0/225, RunningAvgSamplesPerSec=11.954815573235472, CurrSamplesPerSec=11.9885232643992, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.009, 'learning_rate': 8.686247975778677e-06, 'epoch': 5.92} [2022-12-19 18:50:38,807] [INFO] [timer.py:197:stop] 0/226, RunningAvgSamplesPerSec=11.954573554152372, CurrSamplesPerSec=11.90084694109717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:50:45,353] [INFO] [timer.py:197:stop] 0/227, RunningAvgSamplesPerSec=11.954174532329091, CurrSamplesPerSec=11.865459920773597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:50:50,197] [INFO] [timer.py:197:stop] 0/228, RunningAvgSamplesPerSec=11.968873837017497, CurrSamplesPerSec=16.546872163509338, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:50:56,705] [INFO] [timer.py:197:stop] 0/229, RunningAvgSamplesPerSec=11.968564190531376, CurrSamplesPerSec=11.898992668768397, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:51:03,366] [INFO] 
[logging.py:68:log_dist] [Rank 0] step=230, skipped=4, lr=[8.722247506883805e-06], mom=[[0.9, 0.999]] [2022-12-19 18:51:03,367] [INFO] [timer.py:197:stop] 0/230, RunningAvgSamplesPerSec=11.968529183355098, CurrSamplesPerSec=11.96058785029835, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:51:09,913] [INFO] [timer.py:197:stop] 0/231, RunningAvgSamplesPerSec=11.967879020850004, CurrSamplesPerSec=11.82146346241042, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:51:16,442] [INFO] [timer.py:197:stop] 0/232, RunningAvgSamplesPerSec=11.967061938247793, CurrSamplesPerSec=11.782842970197558, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:51:22,911] [INFO] [timer.py:197:stop] 0/233, RunningAvgSamplesPerSec=11.966750754380776, CurrSamplesPerSec=11.895605827793345, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:51:29,433] [INFO] [timer.py:197:stop] 0/234, RunningAvgSamplesPerSec=11.96624630459988, CurrSamplesPerSec=11.850847033807803, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:51:35,937] [INFO] [timer.py:197:stop] 0/235, RunningAvgSamplesPerSec=11.965826672166944, CurrSamplesPerSec=11.869260998652239, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:51:42,458] [INFO] [timer.py:197:stop] 0/236, RunningAvgSamplesPerSec=11.96532584363076, CurrSamplesPerSec=11.849764653880941, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:51:48,927] [INFO] [timer.py:197:stop] 0/237, RunningAvgSamplesPerSec=11.965001571041652, CurrSamplesPerSec=11.889601998535788, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:51:55,464] [INFO] [timer.py:197:stop] 0/238, RunningAvgSamplesPerSec=11.96464550642478, CurrSamplesPerSec=11.881553898436401, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:52:01,965] [INFO] [timer.py:197:stop] 0/239, RunningAvgSamplesPerSec=11.964587134742043, CurrSamplesPerSec=11.950827327433414, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:52:08,499] [INFO] 
[logging.py:68:log_dist] [Rank 0] step=240, skipped=4, lr=[8.79191691333329e-06], mom=[[0.9, 0.999]] [2022-12-19 18:52:08,500] [INFO] [timer.py:197:stop] 0/240, RunningAvgSamplesPerSec=11.964211432829098, CurrSamplesPerSec=11.87583061214829, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:52:15,263] [INFO] [timer.py:197:stop] 0/241, RunningAvgSamplesPerSec=11.964415439189414, CurrSamplesPerSec=12.013167628413601, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:52:21,725] [INFO] [timer.py:197:stop] 0/242, RunningAvgSamplesPerSec=11.96410428726764, CurrSamplesPerSec=11.890200265385754, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:52:28,480] [INFO] [timer.py:197:stop] 0/243, RunningAvgSamplesPerSec=11.963701648160816, CurrSamplesPerSec=11.867845737322513, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:52:35,258] [INFO] [timer.py:197:stop] 0/244, RunningAvgSamplesPerSec=11.960666342571534, CurrSamplesPerSec=11.271482733143172, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:52:42,087] [INFO] [timer.py:197:stop] 0/245, RunningAvgSamplesPerSec=11.960401224587407, CurrSamplesPerSec=11.896586405451426, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:52:48,622] [INFO] [timer.py:197:stop] 0/246, RunningAvgSamplesPerSec=11.960025506642335, CurrSamplesPerSec=11.86942054501301, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:52:55,390] [INFO] [timer.py:197:stop] 0/247, RunningAvgSamplesPerSec=11.959757371131706, CurrSamplesPerSec=11.894689713675385, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:53:02,150] [INFO] [timer.py:197:stop] 0/248, RunningAvgSamplesPerSec=11.95709827152116, CurrSamplesPerSec=11.33941077575699, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:53:09,084] [INFO] [timer.py:197:stop] 0/249, RunningAvgSamplesPerSec=11.95694381985539, CurrSamplesPerSec=11.919069550884041, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:53:15,576] [INFO] 
[logging.py:68:log_dist] [Rank 0] step=250, skipped=4, lr=[8.858694625217149e-06], mom=[[0.9, 0.999]] [2022-12-19 18:53:15,577] [INFO] [timer.py:197:stop] 0/250, RunningAvgSamplesPerSec=11.957018858592273, CurrSamplesPerSec=11.975582318309264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0091, 'learning_rate': 8.858694625217149e-06, 'epoch': 6.58} [2022-12-19 18:53:22,422] [INFO] [timer.py:197:stop] 0/251, RunningAvgSamplesPerSec=11.956752555170752, CurrSamplesPerSec=11.891073548137316, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:53:29,100] [INFO] [timer.py:197:stop] 0/252, RunningAvgSamplesPerSec=11.955038834334175, CurrSamplesPerSec=11.54308543330756, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:53:35,931] [INFO] [timer.py:197:stop] 0/253, RunningAvgSamplesPerSec=11.954886252238936, CurrSamplesPerSec=11.916862539389484, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:53:42,452] [INFO] [timer.py:197:stop] 0/254, RunningAvgSamplesPerSec=11.954717288242081, CurrSamplesPerSec=11.912457839840409, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:53:49,316] [INFO] [timer.py:197:stop] 0/255, RunningAvgSamplesPerSec=11.954474788088856, CurrSamplesPerSec=11.89367677593137, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:53:55,800] [INFO] [timer.py:197:stop] 0/256, RunningAvgSamplesPerSec=11.954679467655913, CurrSamplesPerSec=12.006689580133575, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:54:02,501] [INFO] [timer.py:197:stop] 0/257, RunningAvgSamplesPerSec=11.95421380754742, CurrSamplesPerSec=11.837099457615839, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:54:09,043] [INFO] [timer.py:197:stop] 0/258, RunningAvgSamplesPerSec=11.954035325155182, CurrSamplesPerSec=11.90869561552593, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:54:15,709] [INFO] [timer.py:197:stop] 0/259, RunningAvgSamplesPerSec=11.953507335780868, CurrSamplesPerSec=11.819859196692144, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:54:22,235] [INFO] [logging.py:68:log_dist] [Rank 0] step=260, skipped=4, lr=[8.922811151820517e-06], mom=[[0.9, 0.999]] [2022-12-19 18:54:22,235] [INFO] [timer.py:197:stop] 0/260, RunningAvgSamplesPerSec=11.953524918056072, CurrSamplesPerSec=11.958045278209452, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:54:28,909] [INFO] [timer.py:197:stop] 0/261, RunningAvgSamplesPerSec=11.95294627807689, CurrSamplesPerSec=11.805505793389244, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:54:35,345] [INFO] [timer.py:197:stop] 0/262, RunningAvgSamplesPerSec=11.953099633869945, CurrSamplesPerSec=11.99295172071658, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:54:41,740] [INFO] [timer.py:197:stop] 0/263, RunningAvgSamplesPerSec=11.953154915407932, CurrSamplesPerSec=11.967545485935645, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:54:48,296] [INFO] [timer.py:197:stop] 0/264, RunningAvgSamplesPerSec=11.952958959230942, CurrSamplesPerSec=11.902033134359753, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:54:54,814] [INFO] [timer.py:197:stop] 0/265, RunningAvgSamplesPerSec=11.95284308146731, CurrSamplesPerSec=11.922560318565687, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:54:59,477] [INFO] [timer.py:197:stop] 0/266, RunningAvgSamplesPerSec=11.965370601881446, CurrSamplesPerSec=16.518648561648973, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:55:06,056] [INFO] [timer.py:197:stop] 0/267, RunningAvgSamplesPerSec=11.964809451324921, CurrSamplesPerSec=11.818484309644248, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:55:12,602] [INFO] [timer.py:197:stop] 0/268, RunningAvgSamplesPerSec=11.964545614341409, CurrSamplesPerSec=11.89503653429736, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:55:19,345] [INFO] [timer.py:197:stop] 0/269, RunningAvgSamplesPerSec=11.962250067229826, 
CurrSamplesPerSec=11.381395826072504, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:55:25,788] [INFO] [logging.py:68:log_dist] [Rank 0] step=270, skipped=4, lr=[8.984470493319244e-06], mom=[[0.9, 0.999]] [2022-12-19 18:55:25,788] [INFO] [timer.py:197:stop] 0/270, RunningAvgSamplesPerSec=11.96239745029132, CurrSamplesPerSec=12.001879092210856, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:55:32,290] [INFO] [timer.py:197:stop] 0/271, RunningAvgSamplesPerSec=11.962217071876005, CurrSamplesPerSec=11.91407094948462, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:55:38,796] [INFO] [timer.py:197:stop] 0/272, RunningAvgSamplesPerSec=11.962011088215489, CurrSamplesPerSec=11.90685791002186, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:55:45,334] [INFO] [timer.py:197:stop] 0/273, RunningAvgSamplesPerSec=11.961737058625605, CurrSamplesPerSec=11.888205574663681, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:55:52,031] [INFO] [timer.py:197:stop] 0/274, RunningAvgSamplesPerSec=11.961766852881928, CurrSamplesPerSec=11.969846570313806, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:55:58,536] [INFO] [timer.py:197:stop] 0/275, RunningAvgSamplesPerSec=11.961562278487147, CurrSamplesPerSec=11.906176639929061, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0073, 'learning_rate': 9.014436199608479e-06, 'epoch': 7.24} [2022-12-19 18:56:05,346] [INFO] [timer.py:197:stop] 0/276, RunningAvgSamplesPerSec=11.961402092808727, CurrSamplesPerSec=11.917831280272631, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:56:11,904] [INFO] [timer.py:197:stop] 0/277, RunningAvgSamplesPerSec=11.960747605566851, CurrSamplesPerSec=11.784076626416892, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:56:18,880] [INFO] [timer.py:197:stop] 0/278, RunningAvgSamplesPerSec=11.960585834456229, CurrSamplesPerSec=11.916264231429453, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:56:25,467] 
[INFO] [timer.py:197:stop] 0/279, RunningAvgSamplesPerSec=11.96036717579041, CurrSamplesPerSec=11.900321461301223, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:56:32,269] [INFO] [logging.py:68:log_dist] [Rank 0] step=280, skipped=4, lr=[9.043854055968706e-06], mom=[[0.9, 0.999]] [2022-12-19 18:56:32,270] [INFO] [timer.py:197:stop] 0/280, RunningAvgSamplesPerSec=11.960424377177432, CurrSamplesPerSec=11.976290255866592, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:56:38,829] [INFO] [timer.py:197:stop] 0/281, RunningAvgSamplesPerSec=11.960524268425525, CurrSamplesPerSec=11.988358893881443, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:56:45,526] [INFO] [timer.py:197:stop] 0/282, RunningAvgSamplesPerSec=11.960187110686173, CurrSamplesPerSec=11.86685677654128, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:56:52,089] [INFO] [timer.py:197:stop] 0/283, RunningAvgSamplesPerSec=11.959997466173446, CurrSamplesPerSec=11.907132552572872, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:56:58,834] [INFO] [timer.py:197:stop] 0/284, RunningAvgSamplesPerSec=11.959887502729112, CurrSamplesPerSec=11.929067684740032, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:57:05,438] [INFO] [timer.py:197:stop] 0/285, RunningAvgSamplesPerSec=11.960026464892223, CurrSamplesPerSec=11.99934307356995, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:57:12,118] [INFO] [timer.py:197:stop] 0/286, RunningAvgSamplesPerSec=11.95953376724836, CurrSamplesPerSec=11.821712834178557, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:57:18,800] [INFO] [timer.py:197:stop] 0/287, RunningAvgSamplesPerSec=11.958237330825453, CurrSamplesPerSec=11.601084653221788, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:57:25,361] [INFO] [timer.py:197:stop] 0/288, RunningAvgSamplesPerSec=11.958169600997557, CurrSamplesPerSec=11.938897817912004, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
18:57:31,940] [INFO] [timer.py:197:stop] 0/289, RunningAvgSamplesPerSec=11.957908031342138, CurrSamplesPerSec=11.883565822473857, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:57:38,679] [INFO] [logging.py:68:log_dist] [Rank 0] step=290, skipped=4, lr=[9.10112387015335e-06], mom=[[0.9, 0.999]] [2022-12-19 18:57:38,680] [INFO] [timer.py:197:stop] 0/290, RunningAvgSamplesPerSec=11.956220313805838, CurrSamplesPerSec=11.490767677092833, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:57:45,226] [INFO] [timer.py:197:stop] 0/291, RunningAvgSamplesPerSec=11.956082900079373, CurrSamplesPerSec=11.916638762005238, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:57:51,812] [INFO] [timer.py:197:stop] 0/292, RunningAvgSamplesPerSec=11.955730994704545, CurrSamplesPerSec=11.854891097574198, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:57:58,341] [INFO] [timer.py:197:stop] 0/293, RunningAvgSamplesPerSec=11.955736719084909, CurrSamplesPerSec=11.957397020720714, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:58:04,972] [INFO] [timer.py:197:stop] 0/294, RunningAvgSamplesPerSec=11.955152271599806, CurrSamplesPerSec=11.787471678016537, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:58:11,565] [INFO] [timer.py:197:stop] 0/295, RunningAvgSamplesPerSec=11.955288294524502, CurrSamplesPerSec=11.995139839539275, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:58:18,184] [INFO] [timer.py:197:stop] 0/296, RunningAvgSamplesPerSec=11.954982205488552, CurrSamplesPerSec=11.865968164063977, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:58:24,753] [INFO] [timer.py:197:stop] 0/297, RunningAvgSamplesPerSec=11.95499616872737, CurrSamplesPerSec=11.95910277589178, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:58:31,415] [INFO] [timer.py:197:stop] 0/298, RunningAvgSamplesPerSec=11.954584893681709, CurrSamplesPerSec=11.834481803911697, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 18:58:38,026] [INFO] [timer.py:197:stop] 0/299, RunningAvgSamplesPerSec=11.95441259901824, CurrSamplesPerSec=11.903630753092107, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:58:44,654] [INFO] [logging.py:68:log_dist] [Rank 0] step=300, skipped=4, lr=[9.156425255148058e-06], mom=[[0.9, 0.999]] [2022-12-19 18:58:44,655] [INFO] [timer.py:197:stop] 0/300, RunningAvgSamplesPerSec=11.954225909330887, CurrSamplesPerSec=11.899035919747792, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0072, 'learning_rate': 9.156425255148058e-06, 'epoch': 7.89} [2022-12-19 18:58:51,209] [INFO] [timer.py:197:stop] 0/301, RunningAvgSamplesPerSec=11.954295198335558, CurrSamplesPerSec=11.97497916811743, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:58:57,988] [INFO] [timer.py:197:stop] 0/302, RunningAvgSamplesPerSec=11.952520257598009, CurrSamplesPerSec=11.444447509347373, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:59:04,551] [INFO] [timer.py:197:stop] 0/303, RunningAvgSamplesPerSec=11.95235028071164, CurrSamplesPerSec=11.901574564533727, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:59:09,222] [INFO] [timer.py:197:stop] 0/304, RunningAvgSamplesPerSec=11.963231281508516, CurrSamplesPerSec=16.478730573596483, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:59:15,948] [INFO] [timer.py:197:stop] 0/305, RunningAvgSamplesPerSec=11.963293698186234, CurrSamplesPerSec=11.982173380930739, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:59:22,600] [INFO] [timer.py:197:stop] 0/306, RunningAvgSamplesPerSec=11.963279070866392, CurrSamplesPerSec=11.95884863972922, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:59:29,276] [INFO] [timer.py:197:stop] 0/307, RunningAvgSamplesPerSec=11.962831393051811, CurrSamplesPerSec=11.828273162298121, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:59:35,891] [INFO] [timer.py:197:stop] 0/308, RunningAvgSamplesPerSec=11.96221346985623, 
CurrSamplesPerSec=11.776679594881331, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:59:42,468] [INFO] [timer.py:197:stop] 0/309, RunningAvgSamplesPerSec=11.961672712620798, CurrSamplesPerSec=11.798466094329111, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:59:48,978] [INFO] [logging.py:68:log_dist] [Rank 0] step=310, skipped=4, lr=[9.209889040960644e-06], mom=[[0.9, 0.999]] [2022-12-19 18:59:48,979] [INFO] [timer.py:197:stop] 0/310, RunningAvgSamplesPerSec=11.96172355487196, CurrSamplesPerSec=11.97735258636264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 18:59:55,642] [INFO] [timer.py:197:stop] 0/311, RunningAvgSamplesPerSec=11.961362875300894, CurrSamplesPerSec=11.851299086661022, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:00:02,124] [INFO] [timer.py:197:stop] 0/312, RunningAvgSamplesPerSec=11.961481228079201, CurrSamplesPerSec=11.998164755683883, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:00:08,693] [INFO] [timer.py:197:stop] 0/313, RunningAvgSamplesPerSec=11.961509967070256, CurrSamplesPerSec=11.970425716284513, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:00:15,309] [INFO] [timer.py:197:stop] 0/314, RunningAvgSamplesPerSec=11.961354905041691, CurrSamplesPerSec=11.913324878238512, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:00:21,866] [INFO] [timer.py:197:stop] 0/315, RunningAvgSamplesPerSec=11.960772949711172, CurrSamplesPerSec=11.781926559310214, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:00:28,469] [INFO] [timer.py:197:stop] 0/316, RunningAvgSamplesPerSec=11.960372169375326, CurrSamplesPerSec=11.836234084319031, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:00:35,049] [INFO] [timer.py:197:stop] 0/317, RunningAvgSamplesPerSec=11.960166919545655, CurrSamplesPerSec=11.896064992151805, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:00:41,588] [INFO] [timer.py:197:stop] 0/318, 
RunningAvgSamplesPerSec=11.960189302297533, CurrSamplesPerSec=11.967244041130852, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:00:48,435] [INFO] [timer.py:197:stop] 0/319, RunningAvgSamplesPerSec=11.957889184809606, CurrSamplesPerSec=11.272824180860752, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:00:54,950] [INFO] [logging.py:68:log_dist] [Rank 0] step=320, skipped=4, lr=[9.261633432763397e-06], mom=[[0.9, 0.999]] [2022-12-19 19:00:54,951] [INFO] [timer.py:197:stop] 0/320, RunningAvgSamplesPerSec=11.958074000379625, CurrSamplesPerSec=12.016949898399902, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:01:01,414] [INFO] [timer.py:197:stop] 0/321, RunningAvgSamplesPerSec=11.958221540162944, CurrSamplesPerSec=12.005324579488992, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:01:07,941] [INFO] [timer.py:197:stop] 0/322, RunningAvgSamplesPerSec=11.958316563801247, CurrSamplesPerSec=11.988706379534284, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:01:14,491] [INFO] [timer.py:197:stop] 0/323, RunningAvgSamplesPerSec=11.957953460815437, CurrSamplesPerSec=11.842882123827714, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:01:20,992] [INFO] [timer.py:197:stop] 0/324, RunningAvgSamplesPerSec=11.957815514721151, CurrSamplesPerSec=11.913698695311629, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:01:27,478] [INFO] [timer.py:197:stop] 0/325, RunningAvgSamplesPerSec=11.95756150142011, CurrSamplesPerSec=11.876326608226695, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0062, 'learning_rate': 9.28689473531776e-06, 'epoch': 8.55} [2022-12-19 19:01:33,939] [INFO] [timer.py:197:stop] 0/326, RunningAvgSamplesPerSec=11.95768468917118, CurrSamplesPerSec=11.99760758899516, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:01:40,451] [INFO] [timer.py:197:stop] 0/327, RunningAvgSamplesPerSec=11.95717498461903, CurrSamplesPerSec=11.794287345232508, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 19:01:47,109] [INFO] [timer.py:197:stop] 0/328, RunningAvgSamplesPerSec=11.957030227941305, CurrSamplesPerSec=11.910169253362525, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:01:53,573] [INFO] [timer.py:197:stop] 0/329, RunningAvgSamplesPerSec=11.957083348227176, CurrSamplesPerSec=11.974425755140082, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:02:00,267] [INFO] [logging.py:68:log_dist] [Rank 0] step=330, skipped=4, lr=[9.311765584761373e-06], mom=[[0.9, 0.999]] [2022-12-19 19:02:00,268] [INFO] [timer.py:197:stop] 0/330, RunningAvgSamplesPerSec=11.95695695647057, CurrSamplesPerSec=11.915769654383563, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:02:06,908] [INFO] [timer.py:197:stop] 0/331, RunningAvgSamplesPerSec=11.95592407981561, CurrSamplesPerSec=11.626503471291613, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:02:13,765] [INFO] [timer.py:197:stop] 0/332, RunningAvgSamplesPerSec=11.955770330016128, CurrSamplesPerSec=11.90540040392296, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:02:20,322] [INFO] [timer.py:197:stop] 0/333, RunningAvgSamplesPerSec=11.955500645148247, CurrSamplesPerSec=11.867164202396488, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:02:27,160] [INFO] [timer.py:197:stop] 0/334, RunningAvgSamplesPerSec=11.9552736740695, CurrSamplesPerSec=11.880616811136067, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:02:34,043] [INFO] [timer.py:197:stop] 0/335, RunningAvgSamplesPerSec=11.952698494700247, CurrSamplesPerSec=11.154971118923546, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:02:40,802] [INFO] [timer.py:197:stop] 0/336, RunningAvgSamplesPerSec=11.95256033183708, CurrSamplesPerSec=11.906729043653556, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:02:47,362] [INFO] [timer.py:197:stop] 0/337, RunningAvgSamplesPerSec=11.952411104397754, CurrSamplesPerSec=11.90277673637311, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 19:02:54,241] [INFO] [timer.py:197:stop] 0/338, RunningAvgSamplesPerSec=11.952245235431137, CurrSamplesPerSec=11.89693702755643, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:03:00,666] [INFO] [timer.py:197:stop] 0/339, RunningAvgSamplesPerSec=11.952297693119737, CurrSamplesPerSec=11.969949584808198, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:03:07,655] [INFO] [logging.py:68:log_dist] [Rank 0] step=340, skipped=4, lr=[9.360382936198493e-06], mom=[[0.9, 0.999]] [2022-12-19 19:03:07,656] [INFO] [timer.py:197:stop] 0/340, RunningAvgSamplesPerSec=11.952046517578895, CurrSamplesPerSec=11.86799737506084, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:03:14,176] [INFO] [timer.py:197:stop] 0/341, RunningAvgSamplesPerSec=11.952098305201716, CurrSamplesPerSec=11.969628270846362, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:03:18,888] [INFO] [timer.py:197:stop] 0/342, RunningAvgSamplesPerSec=11.96189269685124, CurrSamplesPerSec=16.563141536427064, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:03:25,404] [INFO] [timer.py:197:stop] 0/343, RunningAvgSamplesPerSec=11.961780248007488, CurrSamplesPerSec=11.923669809142542, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:03:31,889] [INFO] [timer.py:197:stop] 0/344, RunningAvgSamplesPerSec=11.96179881331923, CurrSamplesPerSec=11.968132946789327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:03:38,378] [INFO] [timer.py:197:stop] 0/345, RunningAvgSamplesPerSec=11.961887642140821, CurrSamplesPerSec=11.992344676649527, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:03:44,925] [INFO] [timer.py:197:stop] 0/346, RunningAvgSamplesPerSec=11.96147734624168, CurrSamplesPerSec=11.822387075662531, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:03:51,462] [INFO] [timer.py:197:stop] 0/347, RunningAvgSamplesPerSec=11.961137472988574, CurrSamplesPerSec=11.845356090828643, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 19:03:58,075] [INFO] [timer.py:197:stop] 0/348, RunningAvgSamplesPerSec=11.960436412503, CurrSamplesPerSec=11.723378268033263, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:04:04,729] [INFO] [timer.py:197:stop] 0/349, RunningAvgSamplesPerSec=11.96029100371295, CurrSamplesPerSec=11.910190919457103, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:04:11,186] [INFO] [logging.py:68:log_dist] [Rank 0] step=350, skipped=4, lr=[9.407574351377137e-06], mom=[[0.9, 0.999]] [2022-12-19 19:04:11,187] [INFO] [timer.py:197:stop] 0/350, RunningAvgSamplesPerSec=11.960093519523195, CurrSamplesPerSec=11.891958022502779, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0047, 'learning_rate': 9.407574351377137e-06, 'epoch': 9.21} [2022-12-19 19:04:17,928] [INFO] [timer.py:197:stop] 0/351, RunningAvgSamplesPerSec=11.959944877859641, CurrSamplesPerSec=11.908440975738559, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:04:24,730] [INFO] [timer.py:197:stop] 0/352, RunningAvgSamplesPerSec=11.95804721874999, CurrSamplesPerSec=11.330613472945423, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:04:31,607] [INFO] [timer.py:197:stop] 0/353, RunningAvgSamplesPerSec=11.957861895038432, CurrSamplesPerSec=11.893349531672605, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:04:38,061] [INFO] [timer.py:197:stop] 0/354, RunningAvgSamplesPerSec=11.957734430166152, CurrSamplesPerSec=11.913161505866146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:04:44,866] [INFO] [timer.py:197:stop] 0/355, RunningAvgSamplesPerSec=11.957589078844109, CurrSamplesPerSec=11.906644014503605, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:04:51,414] [INFO] [timer.py:197:stop] 0/356, RunningAvgSamplesPerSec=11.95726131803857, CurrSamplesPerSec=11.842673655450627, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:04:58,367] [INFO] [timer.py:197:stop] 0/357, 
RunningAvgSamplesPerSec=11.956966978504596, CurrSamplesPerSec=11.853673453499752, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:05:04,799] [INFO] [timer.py:197:stop] 0/358, RunningAvgSamplesPerSec=11.95705627127964, CurrSamplesPerSec=11.988839703721833, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:05:11,528] [INFO] [timer.py:197:stop] 0/359, RunningAvgSamplesPerSec=11.956769756407205, CurrSamplesPerSec=11.855635627450615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:05:17,977] [INFO] [logging.py:68:log_dist] [Rank 0] step=360, skipped=4, lr=[9.45342109721062e-06], mom=[[0.9, 0.999]] [2022-12-19 19:05:17,978] [INFO] [timer.py:197:stop] 0/360, RunningAvgSamplesPerSec=11.95675756161352, CurrSamplesPerSec=11.952405609283566, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:05:24,692] [INFO] [timer.py:197:stop] 0/361, RunningAvgSamplesPerSec=11.956812377241004, CurrSamplesPerSec=11.976468722769011, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:05:31,187] [INFO] [timer.py:197:stop] 0/362, RunningAvgSamplesPerSec=11.956668418638309, CurrSamplesPerSec=11.905210320737845, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:05:37,637] [INFO] [timer.py:197:stop] 0/363, RunningAvgSamplesPerSec=11.956682423437918, CurrSamplesPerSec=11.961726284030215, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:05:44,186] [INFO] [timer.py:197:stop] 0/364, RunningAvgSamplesPerSec=11.95656022455115, CurrSamplesPerSec=11.912609033792682, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:05:50,740] [INFO] [timer.py:197:stop] 0/365, RunningAvgSamplesPerSec=11.9561902869176, CurrSamplesPerSec=11.823760266956167, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:05:57,270] [INFO] [timer.py:197:stop] 0/366, RunningAvgSamplesPerSec=11.956040415843384, CurrSamplesPerSec=11.901884319565061, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:06:03,705] [INFO] [timer.py:197:stop] 0/367, 
RunningAvgSamplesPerSec=11.956143339080421, CurrSamplesPerSec=11.993725482907088, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:06:10,145] [INFO] [timer.py:197:stop] 0/368, RunningAvgSamplesPerSec=11.956283255814228, CurrSamplesPerSec=12.007572538927086, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:06:16,704] [INFO] [timer.py:197:stop] 0/369, RunningAvgSamplesPerSec=11.956092776604171, CurrSamplesPerSec=11.886782633986288, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:06:23,395] [INFO] [logging.py:68:log_dist] [Rank 0] step=370, skipped=4, lr=[9.497997685324628e-06], mom=[[0.9, 0.999]] [2022-12-19 19:06:23,396] [INFO] [timer.py:197:stop] 0/370, RunningAvgSamplesPerSec=11.955945369165262, CurrSamplesPerSec=11.902091183692848, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:06:29,867] [INFO] [timer.py:197:stop] 0/371, RunningAvgSamplesPerSec=11.955848869740944, CurrSamplesPerSec=11.920442532826955, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:06:36,663] [INFO] [timer.py:197:stop] 0/372, RunningAvgSamplesPerSec=11.95576061031763, CurrSamplesPerSec=11.92328159635384, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:06:43,170] [INFO] [timer.py:197:stop] 0/373, RunningAvgSamplesPerSec=11.955788598150932, CurrSamplesPerSec=11.966153097939433, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:06:49,783] [INFO] [timer.py:197:stop] 0/374, RunningAvgSamplesPerSec=11.955901664679635, CurrSamplesPerSec=11.997997439364816, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:06:56,240] [INFO] [timer.py:197:stop] 0/375, RunningAvgSamplesPerSec=11.955797558161814, CurrSamplesPerSec=11.917195311371742, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0042, 'learning_rate': 9.519831289296397e-06, 'epoch': 9.87} [2022-12-19 19:07:02,841] [INFO] [timer.py:197:stop] 0/376, RunningAvgSamplesPerSec=11.955506054871446, CurrSamplesPerSec=11.847757882882286, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 19:07:09,293] [INFO] [timer.py:197:stop] 0/377, RunningAvgSamplesPerSec=11.95562445595161, CurrSamplesPerSec=12.000071526000058, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:07:15,821] [INFO] [timer.py:197:stop] 0/378, RunningAvgSamplesPerSec=11.955092856675767, CurrSamplesPerSec=11.759021316061945, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:07:22,317] [INFO] [timer.py:197:stop] 0/379, RunningAvgSamplesPerSec=11.954902461156566, CurrSamplesPerSec=11.883741010455603, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:07:26,981] [INFO] [logging.py:68:log_dist] [Rank 0] step=380, skipped=4, lr=[9.541372600623587e-06], mom=[[0.9, 0.999]] [2022-12-19 19:07:26,982] [INFO] [timer.py:197:stop] 0/380, RunningAvgSamplesPerSec=11.963689065685433, CurrSamplesPerSec=16.549282728973843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:07:33,625] [INFO] [timer.py:197:stop] 0/381, RunningAvgSamplesPerSec=11.963515822089455, CurrSamplesPerSec=11.898387188069284, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:07:40,071] [INFO] [timer.py:197:stop] 0/382, RunningAvgSamplesPerSec=11.963524869652602, CurrSamplesPerSec=11.966954881801362, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:07:46,910] [INFO] [timer.py:197:stop] 0/383, RunningAvgSamplesPerSec=11.96353714862309, CurrSamplesPerSec=11.968204982743584, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:07:53,397] [INFO] [timer.py:197:stop] 0/384, RunningAvgSamplesPerSec=11.96361537290439, CurrSamplesPerSec=11.993493450938637, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:07:59,897] [INFO] [timer.py:197:stop] 0/385, RunningAvgSamplesPerSec=11.96348999361119, CurrSamplesPerSec=11.91578658040611, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:08:06,462] [INFO] [timer.py:197:stop] 0/386, RunningAvgSamplesPerSec=11.963306585276023, CurrSamplesPerSec=11.893472312029314, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 19:08:13,071] [INFO] [timer.py:197:stop] 0/387, RunningAvgSamplesPerSec=11.962700255973227, CurrSamplesPerSec=11.73432623045743, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:08:19,557] [INFO] [timer.py:197:stop] 0/388, RunningAvgSamplesPerSec=11.96266029050814, CurrSamplesPerSec=11.947293403103313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:08:26,069] [INFO] [timer.py:197:stop] 0/389, RunningAvgSamplesPerSec=11.962412303548426, CurrSamplesPerSec=11.867451182543181, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:08:32,472] [INFO] [logging.py:68:log_dist] [Rank 0] step=390, skipped=4, lr=[9.583608934209288e-06], mom=[[0.9, 0.999]] [2022-12-19 19:08:32,473] [INFO] [timer.py:197:stop] 0/390, RunningAvgSamplesPerSec=11.96248295309582, CurrSamplesPerSec=11.989887124584424, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:08:38,911] [INFO] [timer.py:197:stop] 0/391, RunningAvgSamplesPerSec=11.962487447405872, CurrSamplesPerSec=11.964231494594209, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:08:45,448] [INFO] [timer.py:197:stop] 0/392, RunningAvgSamplesPerSec=11.962310595796598, CurrSamplesPerSec=11.893909704715535, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:08:51,951] [INFO] [timer.py:197:stop] 0/393, RunningAvgSamplesPerSec=11.962018215900912, CurrSamplesPerSec=11.849069503005135, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:08:58,516] [INFO] [timer.py:197:stop] 0/394, RunningAvgSamplesPerSec=11.961865640501298, CurrSamplesPerSec=11.902505461282438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:09:05,046] [INFO] [timer.py:197:stop] 0/395, RunningAvgSamplesPerSec=11.961759973224343, CurrSamplesPerSec=11.920481705032536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:09:11,734] [INFO] [timer.py:197:stop] 0/396, RunningAvgSamplesPerSec=11.961591799465188, CurrSamplesPerSec=11.895863609159935, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:09:18,253] [INFO] [timer.py:197:stop] 0/397, RunningAvgSamplesPerSec=11.961578446223642, CurrSamplesPerSec=11.956319587980142, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:09:25,110] [INFO] [timer.py:197:stop] 0/398, RunningAvgSamplesPerSec=11.961438450551832, CurrSamplesPerSec=11.906395271603161, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:09:31,550] [INFO] [timer.py:197:stop] 0/399, RunningAvgSamplesPerSec=11.961280801547348, CurrSamplesPerSec=11.899176751335, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:09:38,129] [INFO] [logging.py:68:log_dist] [Rank 0] step=400, skipped=4, lr=[9.624764935335318e-06], mom=[[0.9, 0.999]] [2022-12-19 19:09:38,130] [INFO] [timer.py:197:stop] 0/400, RunningAvgSamplesPerSec=11.961067135437117, CurrSamplesPerSec=11.87684051226843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0048, 'learning_rate': 9.624764935335318e-06, 'epoch': 10.53} [2022-12-19 19:09:44,555] [INFO] [timer.py:197:stop] 0/401, RunningAvgSamplesPerSec=11.961098071202871, CurrSamplesPerSec=11.97342322503396, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:09:51,076] [INFO] [timer.py:197:stop] 0/402, RunningAvgSamplesPerSec=11.96077417537623, CurrSamplesPerSec=11.8329245988732, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:09:57,490] [INFO] [timer.py:197:stop] 0/403, RunningAvgSamplesPerSec=11.960802453686547, CurrSamplesPerSec=11.972124511845799, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:10:03,987] [INFO] [timer.py:197:stop] 0/404, RunningAvgSamplesPerSec=11.96073005384135, CurrSamplesPerSec=11.931768190498323, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:10:10,534] [INFO] [timer.py:197:stop] 0/405, RunningAvgSamplesPerSec=11.960600365426814, CurrSamplesPerSec=11.908692445669109, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:10:17,109] [INFO] [timer.py:197:stop] 0/406, 
RunningAvgSamplesPerSec=11.96039876029092, CurrSamplesPerSec=11.87970142694694, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:10:23,826] [INFO] [timer.py:197:stop] 0/407, RunningAvgSamplesPerSec=11.960286379575, CurrSamplesPerSec=11.915056689358508, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:10:30,565] [INFO] [timer.py:197:stop] 0/408, RunningAvgSamplesPerSec=11.958714014744611, CurrSamplesPerSec=11.354177622628272, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:10:37,154] [INFO] [timer.py:197:stop] 0/409, RunningAvgSamplesPerSec=11.958522461852873, CurrSamplesPerSec=11.881255718260933, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:10:43,692] [INFO] [logging.py:68:log_dist] [Rank 0] step=410, skipped=4, lr=[9.664894494516345e-06], mom=[[0.9, 0.999]] [2022-12-19 19:10:43,693] [INFO] [timer.py:197:stop] 0/410, RunningAvgSamplesPerSec=11.95830970299223, CurrSamplesPerSec=11.872340896041663, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:10:50,188] [INFO] [timer.py:197:stop] 0/411, RunningAvgSamplesPerSec=11.958350637485843, CurrSamplesPerSec=11.975075326167289, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:10:56,765] [INFO] [timer.py:197:stop] 0/412, RunningAvgSamplesPerSec=11.95823281659097, CurrSamplesPerSec=11.910237951006847, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:11:03,490] [INFO] [timer.py:197:stop] 0/413, RunningAvgSamplesPerSec=11.956746920884248, CurrSamplesPerSec=11.377134006335005, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:11:10,019] [INFO] [timer.py:197:stop] 0/414, RunningAvgSamplesPerSec=11.956655669909175, CurrSamplesPerSec=11.919269074106126, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:11:16,449] [INFO] [timer.py:197:stop] 0/415, RunningAvgSamplesPerSec=11.956655685292697, CurrSamplesPerSec=11.956662023307778, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:11:22,954] [INFO] [timer.py:197:stop] 0/416, 
RunningAvgSamplesPerSec=11.956780517069827, CurrSamplesPerSec=12.00855984492215, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:11:29,464] [INFO] [timer.py:197:stop] 0/417, RunningAvgSamplesPerSec=11.956549843725488, CurrSamplesPerSec=11.86180961146304, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:11:34,064] [INFO] [timer.py:197:stop] 0/418, RunningAvgSamplesPerSec=11.964625106550153, CurrSamplesPerSec=16.62410075896584, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:11:40,480] [INFO] [timer.py:197:stop] 0/419, RunningAvgSamplesPerSec=11.964686513632762, CurrSamplesPerSec=11.990286649261607, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:11:47,155] [INFO] [logging.py:68:log_dist] [Rank 0] step=420, skipped=4, lr=[9.704047567846437e-06], mom=[[0.9, 0.999]] [2022-12-19 19:11:47,156] [INFO] [timer.py:197:stop] 0/420, RunningAvgSamplesPerSec=11.964784959543438, CurrSamplesPerSec=12.005978581193482, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:11:53,887] [INFO] [timer.py:197:stop] 0/421, RunningAvgSamplesPerSec=11.963580703242256, CurrSamplesPerSec=11.480573193271733, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:12:00,749] [INFO] [timer.py:197:stop] 0/422, RunningAvgSamplesPerSec=11.963679846406063, CurrSamplesPerSec=12.005365922314523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:12:07,207] [INFO] [timer.py:197:stop] 0/423, RunningAvgSamplesPerSec=11.963662142630655, CurrSamplesPerSec=11.956231186399881, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:12:14,197] [INFO] [timer.py:197:stop] 0/424, RunningAvgSamplesPerSec=11.963453143781376, CurrSamplesPerSec=11.876108554000023, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:12:20,780] [INFO] [timer.py:197:stop] 0/425, RunningAvgSamplesPerSec=11.962856104514723, CurrSamplesPerSec=11.716114491124491, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0047, 'learning_rate': 9.723272550712454e-06, 
'epoch': 11.18} [2022-12-19 19:12:27,491] [INFO] [timer.py:197:stop] 0/426, RunningAvgSamplesPerSec=11.9627655738164, CurrSamplesPerSec=11.92459357121858, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:12:33,946] [INFO] [timer.py:197:stop] 0/427, RunningAvgSamplesPerSec=11.962573738945366, CurrSamplesPerSec=11.881786352622898, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:12:40,448] [INFO] [timer.py:197:stop] 0/428, RunningAvgSamplesPerSec=11.962651896566687, CurrSamplesPerSec=11.99596159511273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:12:46,950] [INFO] [timer.py:197:stop] 0/429, RunningAvgSamplesPerSec=11.962543178028197, CurrSamplesPerSec=11.916408116097363, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:12:53,427] [INFO] [logging.py:68:log_dist] [Rank 0] step=430, skipped=4, lr=[9.742270550908135e-06], mom=[[0.9, 0.999]] [2022-12-19 19:12:53,427] [INFO] [timer.py:197:stop] 0/430, RunningAvgSamplesPerSec=11.962451764284422, CurrSamplesPerSec=11.923545345263417, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:12:59,896] [INFO] [timer.py:197:stop] 0/431, RunningAvgSamplesPerSec=11.962560863599244, CurrSamplesPerSec=12.00943878053189, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:13:06,411] [INFO] [timer.py:197:stop] 0/432, RunningAvgSamplesPerSec=11.962281997422737, CurrSamplesPerSec=11.843835738165968, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:13:12,890] [INFO] [timer.py:197:stop] 0/433, RunningAvgSamplesPerSec=11.96228584457164, CurrSamplesPerSec=11.963940347935317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:13:19,644] [INFO] [timer.py:197:stop] 0/434, RunningAvgSamplesPerSec=11.960699638292724, CurrSamplesPerSec=11.314089698343578, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:13:26,112] [INFO] [timer.py:197:stop] 0/435, RunningAvgSamplesPerSec=11.960734196921816, CurrSamplesPerSec=11.97568222593352, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 19:13:32,572] [INFO] [timer.py:197:stop] 0/436, RunningAvgSamplesPerSec=11.960840034207777, CurrSamplesPerSec=12.006844249664644, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:13:39,117] [INFO] [timer.py:197:stop] 0/437, RunningAvgSamplesPerSec=11.960979997113505, CurrSamplesPerSec=12.022034679168051, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:13:45,582] [INFO] [timer.py:197:stop] 0/438, RunningAvgSamplesPerSec=11.961056155926537, CurrSamplesPerSec=11.99427746570264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:13:52,036] [INFO] [timer.py:197:stop] 0/439, RunningAvgSamplesPerSec=11.961136989532214, CurrSamplesPerSec=11.996484832797169, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:13:58,526] [INFO] [logging.py:68:log_dist] [Rank 0] step=440, skipped=4, lr=[9.779606609292176e-06], mom=[[0.9, 0.999]] [2022-12-19 19:13:58,527] [INFO] [timer.py:197:stop] 0/440, RunningAvgSamplesPerSec=11.961036552526675, CurrSamplesPerSec=11.917306415853819, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:14:05,155] [INFO] [timer.py:197:stop] 0/441, RunningAvgSamplesPerSec=11.960953853945096, CurrSamplesPerSec=11.924841485859334, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:14:11,831] [INFO] [timer.py:197:stop] 0/442, RunningAvgSamplesPerSec=11.961060435577396, CurrSamplesPerSec=12.008033941923953, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:14:18,344] [INFO] [timer.py:197:stop] 0/443, RunningAvgSamplesPerSec=11.961049702856819, CurrSamplesPerSec=11.956329173772223, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:14:24,855] [INFO] [timer.py:197:stop] 0/444, RunningAvgSamplesPerSec=11.960906092686594, CurrSamplesPerSec=11.897908332342341, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:14:31,346] [INFO] [timer.py:197:stop] 0/445, RunningAvgSamplesPerSec=11.960658339820073, CurrSamplesPerSec=11.852147302025633, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:14:38,050] [INFO] [timer.py:197:stop] 0/446, RunningAvgSamplesPerSec=11.960672121279263, CurrSamplesPerSec=11.966780432654314, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:14:44,600] [INFO] [timer.py:197:stop] 0/447, RunningAvgSamplesPerSec=11.96078237187159, CurrSamplesPerSec=12.009935253344844, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:14:51,124] [INFO] [timer.py:197:stop] 0/448, RunningAvgSamplesPerSec=11.960669747874908, CurrSamplesPerSec=11.910761664482132, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:14:57,689] [INFO] [timer.py:197:stop] 0/449, RunningAvgSamplesPerSec=11.960243530588247, CurrSamplesPerSec=11.773131201979238, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:15:04,175] [INFO] [logging.py:68:log_dist] [Rank 0] step=450, skipped=4, lr=[9.816095971633122e-06], mom=[[0.9, 0.999]] [2022-12-19 19:15:04,175] [INFO] [timer.py:197:stop] 0/450, RunningAvgSamplesPerSec=11.960103229206762, CurrSamplesPerSec=11.897716378974831, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0048, 'learning_rate': 9.816095971633122e-06, 'epoch': 11.84} [2022-12-19 19:15:10,657] [INFO] [timer.py:197:stop] 0/451, RunningAvgSamplesPerSec=11.960000696720844, CurrSamplesPerSec=11.914242278824265, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:15:17,119] [INFO] [timer.py:197:stop] 0/452, RunningAvgSamplesPerSec=11.959940219437032, CurrSamplesPerSec=11.932847568183282, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:15:23,587] [INFO] [timer.py:197:stop] 0/453, RunningAvgSamplesPerSec=11.960066198484084, CurrSamplesPerSec=12.017027364896247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:15:30,315] [INFO] [timer.py:197:stop] 0/454, RunningAvgSamplesPerSec=11.959986501887036, CurrSamplesPerSec=11.924151270485098, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:15:36,723] [INFO] [timer.py:197:stop] 0/455, 
RunningAvgSamplesPerSec=11.960105717157514, CurrSamplesPerSec=12.014235436095843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:15:41,374] [INFO] [timer.py:197:stop] 0/456, RunningAvgSamplesPerSec=11.967390180794478, CurrSamplesPerSec=16.527393090849067, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:15:47,861] [INFO] [timer.py:197:stop] 0/457, RunningAvgSamplesPerSec=11.967289505176039, CurrSamplesPerSec=11.921757059754666, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:15:54,337] [INFO] [timer.py:197:stop] 0/458, RunningAvgSamplesPerSec=11.967092049162074, CurrSamplesPerSec=11.877920486233057, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:16:00,838] [INFO] [timer.py:197:stop] 0/459, RunningAvgSamplesPerSec=11.966991307584832, CurrSamplesPerSec=11.921229202389917, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:16:07,298] [INFO] [logging.py:68:log_dist] [Rank 0] step=460, skipped=4, lr=[9.851776190149156e-06], mom=[[0.9, 0.999]] [2022-12-19 19:16:07,299] [INFO] [timer.py:197:stop] 0/460, RunningAvgSamplesPerSec=11.966799772932442, CurrSamplesPerSec=11.87990541764102, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:16:13,782] [INFO] [timer.py:197:stop] 0/461, RunningAvgSamplesPerSec=11.966896572288283, CurrSamplesPerSec=12.011395895133065, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:16:20,264] [INFO] [timer.py:197:stop] 0/462, RunningAvgSamplesPerSec=11.96680418513396, CurrSamplesPerSec=11.92454854509294, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:16:26,755] [INFO] [timer.py:197:stop] 0/463, RunningAvgSamplesPerSec=11.966607455898583, CurrSamplesPerSec=11.876792693144612, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:16:33,193] [INFO] [timer.py:197:stop] 0/464, RunningAvgSamplesPerSec=11.966614079413304, CurrSamplesPerSec=11.96966830071487, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:16:39,837] [INFO] [timer.py:197:stop] 
0/465, RunningAvgSamplesPerSec=11.96663256488401, CurrSamplesPerSec=11.975178964905716, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:16:46,434] [INFO] [timer.py:197:stop] 0/466, RunningAvgSamplesPerSec=11.96647316075237, CurrSamplesPerSec=11.893122421041914, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:16:52,958] [INFO] [timer.py:197:stop] 0/467, RunningAvgSamplesPerSec=11.966300651739248, CurrSamplesPerSec=11.886789476762956, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:16:59,404] [INFO] [timer.py:197:stop] 0/468, RunningAvgSamplesPerSec=11.966318566858343, CurrSamplesPerSec=11.974654913188974, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:17:05,941] [INFO] [timer.py:197:stop] 0/469, RunningAvgSamplesPerSec=11.96599109239272, CurrSamplesPerSec=11.815313713905496, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:17:12,443] [INFO] [logging.py:68:log_dist] [Rank 0] step=470, skipped=4, lr=[9.886682372916766e-06], mom=[[0.9, 0.999]] [2022-12-19 19:17:12,444] [INFO] [timer.py:197:stop] 0/470, RunningAvgSamplesPerSec=11.965779796564986, CurrSamplesPerSec=11.867913422893002, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:17:19,128] [INFO] [timer.py:197:stop] 0/471, RunningAvgSamplesPerSec=11.965573350892216, CurrSamplesPerSec=11.869732302583541, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:17:25,633] [INFO] [timer.py:197:stop] 0/472, RunningAvgSamplesPerSec=11.965648345094552, CurrSamplesPerSec=12.00092453926605, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:17:32,044] [INFO] [timer.py:197:stop] 0/473, RunningAvgSamplesPerSec=11.965736066432036, CurrSamplesPerSec=12.007107948921425, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:17:38,523] [INFO] [timer.py:197:stop] 0/474, RunningAvgSamplesPerSec=11.965855329328619, CurrSamplesPerSec=12.022293661720143, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:17:45,006] [INFO] 
[timer.py:197:stop] 0/475, RunningAvgSamplesPerSec=11.965804922445507, CurrSamplesPerSec=11.94206018617277, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0037, 'learning_rate': 9.90385555539545e-06, 'epoch': 12.5} [2022-12-19 19:17:51,531] [INFO] [timer.py:197:stop] 0/476, RunningAvgSamplesPerSec=11.965675277428401, CurrSamplesPerSec=11.904666505150017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:17:58,087] [INFO] [timer.py:197:stop] 0/477, RunningAvgSamplesPerSec=11.965494902945602, CurrSamplesPerSec=11.88060524308075, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:18:04,553] [INFO] [timer.py:197:stop] 0/478, RunningAvgSamplesPerSec=11.965566826450265, CurrSamplesPerSec=11.999828519887469, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:18:11,010] [INFO] [timer.py:197:stop] 0/479, RunningAvgSamplesPerSec=11.965573289633536, CurrSamplesPerSec=11.968650557730545, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:18:17,702] [INFO] [logging.py:68:log_dist] [Rank 0] step=480, skipped=4, lr=[9.92084739148192e-06], mom=[[0.9, 0.999]] [2022-12-19 19:18:17,703] [INFO] [timer.py:197:stop] 0/480, RunningAvgSamplesPerSec=11.965460057094816, CurrSamplesPerSec=11.911691356125353, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:18:24,201] [INFO] [timer.py:197:stop] 0/481, RunningAvgSamplesPerSec=11.96552406855967, CurrSamplesPerSec=11.996200155813812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:18:30,721] [INFO] [timer.py:197:stop] 0/482, RunningAvgSamplesPerSec=11.965187447997225, CurrSamplesPerSec=11.806094588280331, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:18:37,309] [INFO] [timer.py:197:stop] 0/483, RunningAvgSamplesPerSec=11.965072049912216, CurrSamplesPerSec=11.909936744138914, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:18:43,797] [INFO] [timer.py:197:stop] 0/484, RunningAvgSamplesPerSec=11.964938508978381, CurrSamplesPerSec=11.901049020481066, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:18:50,333] [INFO] [timer.py:197:stop] 0/485, RunningAvgSamplesPerSec=11.964813801717309, CurrSamplesPerSec=11.905005988083438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:18:56,880] [INFO] [timer.py:197:stop] 0/486, RunningAvgSamplesPerSec=11.964642634590682, CurrSamplesPerSec=11.882537420622386, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:19:03,454] [INFO] [timer.py:197:stop] 0/487, RunningAvgSamplesPerSec=11.964751683093262, CurrSamplesPerSec=12.017765498552977, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:19:10,172] [INFO] [timer.py:197:stop] 0/488, RunningAvgSamplesPerSec=11.964681054332619, CurrSamplesPerSec=11.930524098608846, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:19:16,891] [INFO] [timer.py:197:stop] 0/489, RunningAvgSamplesPerSec=11.964601924421471, CurrSamplesPerSec=11.926268254854502, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:19:23,383] [INFO] [logging.py:68:log_dist] [Rank 0] step=490, skipped=4, lr=[9.954302066885107e-06], mom=[[0.9, 0.999]] [2022-12-19 19:19:23,384] [INFO] [timer.py:197:stop] 0/490, RunningAvgSamplesPerSec=11.964479521534537, CurrSamplesPerSec=11.905165440871563, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:19:29,863] [INFO] [timer.py:197:stop] 0/491, RunningAvgSamplesPerSec=11.964440521888939, CurrSamplesPerSec=11.945438982511188, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:19:36,326] [INFO] [timer.py:197:stop] 0/492, RunningAvgSamplesPerSec=11.964520793316316, CurrSamplesPerSec=12.003902989077892, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:19:42,994] [INFO] [timer.py:197:stop] 0/493, RunningAvgSamplesPerSec=11.964693088494137, CurrSamplesPerSec=12.04971890471031, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:19:47,636] [INFO] [timer.py:197:stop] 0/494, RunningAvgSamplesPerSec=11.97149869912848, 
CurrSamplesPerSec=16.610577646885343, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:19:54,118] [INFO] [timer.py:197:stop] 0/495, RunningAvgSamplesPerSec=11.971497733240914, CurrSamplesPerSec=11.971022535461254, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:20:00,784] [INFO] [timer.py:197:stop] 0/496, RunningAvgSamplesPerSec=11.971336688771379, CurrSamplesPerSec=11.892465904054477, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:20:07,363] [INFO] [timer.py:197:stop] 0/497, RunningAvgSamplesPerSec=11.971197147138936, CurrSamplesPerSec=11.902659041575854, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:20:14,024] [INFO] [timer.py:197:stop] 0/498, RunningAvgSamplesPerSec=11.97120584935865, CurrSamplesPerSec=11.975515001812596, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:20:20,568] [INFO] [timer.py:197:stop] 0/499, RunningAvgSamplesPerSec=11.971001559434184, CurrSamplesPerSec=11.870525941213016, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:20:27,050] [INFO] [logging.py:68:log_dist] [Rank 0] step=500, skipped=4, lr=[9.987075336738768e-06], mom=[[0.9, 0.999]] [2022-12-19 19:20:27,050] [INFO] [timer.py:197:stop] 0/500, RunningAvgSamplesPerSec=11.970882028414971, CurrSamplesPerSec=11.911769057222811, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0038, 'learning_rate': 9.987075336738768e-06, 'epoch': 13.16} [2022-12-19 19:20:33,554] [INFO] [timer.py:197:stop] 0/501, RunningAvgSamplesPerSec=11.970903542802073, CurrSamplesPerSec=11.98162732482164, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:20:40,103] [INFO] [timer.py:197:stop] 0/502, RunningAvgSamplesPerSec=11.970767152561097, CurrSamplesPerSec=11.903093943358158, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:20:46,815] [INFO] [timer.py:197:stop] 0/503, RunningAvgSamplesPerSec=11.97060946488682, CurrSamplesPerSec=11.892282555918243, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:20:53,323] 
[INFO] [timer.py:197:stop] 0/504, RunningAvgSamplesPerSec=11.970491524090875, CurrSamplesPerSec=11.911693999003171, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:20:59,789] [INFO] [timer.py:197:stop] 0/505, RunningAvgSamplesPerSec=11.970489548814147, CurrSamplesPerSec=11.969498042193239, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:21:06,331] [INFO] [timer.py:197:stop] 0/506, RunningAvgSamplesPerSec=11.970397710866356, CurrSamplesPerSec=11.924381156831144, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:21:12,818] [INFO] [timer.py:197:stop] 0/507, RunningAvgSamplesPerSec=11.970445463779823, CurrSamplesPerSec=11.994561515483362, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:21:19,429] [INFO] [timer.py:197:stop] 0/508, RunningAvgSamplesPerSec=11.97046189142773, CurrSamplesPerSec=11.978763618410035, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:21:25,999] [INFO] [timer.py:197:stop] 0/509, RunningAvgSamplesPerSec=11.970345202381258, CurrSamplesPerSec=11.91159092763728, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:21:32,449] [INFO] [logging.py:68:log_dist] [Rank 0] step=510, skipped=4, lr=[9.98888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 19:21:32,450] [INFO] [timer.py:197:stop] 0/510, RunningAvgSamplesPerSec=11.97041264973506, CurrSamplesPerSec=12.004706618723947, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:21:38,913] [INFO] [timer.py:197:stop] 0/511, RunningAvgSamplesPerSec=11.970248192883536, CurrSamplesPerSec=11.887284283033926, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:21:45,554] [INFO] [timer.py:197:stop] 0/512, RunningAvgSamplesPerSec=11.970127696373872, CurrSamplesPerSec=11.909108239628411, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:21:52,021] [INFO] [timer.py:197:stop] 0/513, RunningAvgSamplesPerSec=11.96994284545872, CurrSamplesPerSec=11.876407001323265, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:21:58,446] 
[INFO] [timer.py:197:stop] 0/514, RunningAvgSamplesPerSec=11.97000660132427, CurrSamplesPerSec=12.002674937343045, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:22:05,170] [INFO] [timer.py:197:stop] 0/515, RunningAvgSamplesPerSec=11.969899246291638, CurrSamplesPerSec=11.91518520716523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:22:11,627] [INFO] [timer.py:197:stop] 0/516, RunningAvgSamplesPerSec=11.969935225269287, CurrSamplesPerSec=11.988421000778564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:22:18,089] [INFO] [timer.py:197:stop] 0/517, RunningAvgSamplesPerSec=11.969880020421801, CurrSamplesPerSec=11.941571965248388, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:22:24,567] [INFO] [timer.py:197:stop] 0/518, RunningAvgSamplesPerSec=11.969677927576155, CurrSamplesPerSec=11.866499007461336, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:22:31,240] [INFO] [timer.py:197:stop] 0/519, RunningAvgSamplesPerSec=11.969563267659181, CurrSamplesPerSec=11.910690318362558, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:22:37,713] [INFO] [logging.py:68:log_dist] [Rank 0] step=520, skipped=4, lr=[9.966666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 19:22:37,713] [INFO] [timer.py:197:stop] 0/520, RunningAvgSamplesPerSec=11.969394647037719, CurrSamplesPerSec=11.882849341609326, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:22:44,244] [INFO] [timer.py:197:stop] 0/521, RunningAvgSamplesPerSec=11.969365331913723, CurrSamplesPerSec=11.954199375497849, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:22:50,651] [INFO] [timer.py:197:stop] 0/522, RunningAvgSamplesPerSec=11.969211039043975, CurrSamplesPerSec=11.889666246139303, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:22:57,087] [INFO] [timer.py:197:stop] 0/523, RunningAvgSamplesPerSec=11.96923942085822, CurrSamplesPerSec=11.984016219678896, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
19:23:03,646] [INFO] [timer.py:197:stop] 0/524, RunningAvgSamplesPerSec=11.969128333123505, CurrSamplesPerSec=11.911530671357285, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:23:10,116] [INFO] [timer.py:197:stop] 0/525, RunningAvgSamplesPerSec=11.969007049771843, CurrSamplesPerSec=11.906030889940311, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0034, 'learning_rate': 9.955555555555556e-06, 'epoch': 13.82} [2022-12-19 19:23:16,605] [INFO] [timer.py:197:stop] 0/526, RunningAvgSamplesPerSec=11.968961223189154, CurrSamplesPerSec=11.94504190936655, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:23:23,111] [INFO] [timer.py:197:stop] 0/527, RunningAvgSamplesPerSec=11.96878588899259, CurrSamplesPerSec=11.877611977281537, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:23:29,721] [INFO] [timer.py:197:stop] 0/528, RunningAvgSamplesPerSec=11.968603648091593, CurrSamplesPerSec=11.873687376673056, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:23:36,296] [INFO] [timer.py:197:stop] 0/529, RunningAvgSamplesPerSec=11.96850037213131, CurrSamplesPerSec=11.914423131806448, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:23:42,964] [INFO] [logging.py:68:log_dist] [Rank 0] step=530, skipped=4, lr=[9.944444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 19:23:42,965] [INFO] [timer.py:197:stop] 0/530, RunningAvgSamplesPerSec=11.968289035892484, CurrSamplesPerSec=11.857943635912738, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:23:49,644] [INFO] [timer.py:197:stop] 0/531, RunningAvgSamplesPerSec=11.9681654720526, CurrSamplesPerSec=11.9032781528446, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:23:54,355] [INFO] [timer.py:197:stop] 0/532, RunningAvgSamplesPerSec=11.974241365725362, CurrSamplesPerSec=16.370734748869264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:24:00,845] [INFO] [timer.py:197:stop] 0/533, RunningAvgSamplesPerSec=11.974096085288423, 
CurrSamplesPerSec=11.89759034702779, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:24:07,308] [INFO] [timer.py:197:stop] 0/534, RunningAvgSamplesPerSec=11.974109244933327, CurrSamplesPerSec=11.981101104325049, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:24:13,817] [INFO] [timer.py:197:stop] 0/535, RunningAvgSamplesPerSec=11.973972548985593, CurrSamplesPerSec=11.901690127131745, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:24:20,452] [INFO] [timer.py:197:stop] 0/536, RunningAvgSamplesPerSec=11.973906687366869, CurrSamplesPerSec=11.93890525181514, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:24:26,964] [INFO] [timer.py:197:stop] 0/537, RunningAvgSamplesPerSec=11.973718586057204, CurrSamplesPerSec=11.874109658931614, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:24:33,494] [INFO] [timer.py:197:stop] 0/538, RunningAvgSamplesPerSec=11.973388733887482, CurrSamplesPerSec=11.799485700046317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:24:39,996] [INFO] [timer.py:197:stop] 0/539, RunningAvgSamplesPerSec=11.97315662249098, CurrSamplesPerSec=11.85002672967011, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:24:46,538] [INFO] [logging.py:68:log_dist] [Rank 0] step=540, skipped=4, lr=[9.922222222222222e-06], mom=[[0.9, 0.999]] [2022-12-19 19:24:46,539] [INFO] [timer.py:197:stop] 0/540, RunningAvgSamplesPerSec=11.972922414016997, CurrSamplesPerSec=11.848462293475453, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:24:52,984] [INFO] [timer.py:197:stop] 0/541, RunningAvgSamplesPerSec=11.972941329547208, CurrSamplesPerSec=11.983126557957817, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:24:59,544] [INFO] [timer.py:197:stop] 0/542, RunningAvgSamplesPerSec=11.972901582717197, CurrSamplesPerSec=11.951516377647954, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:25:06,046] [INFO] [timer.py:197:stop] 0/543, 
RunningAvgSamplesPerSec=11.972886270601725, CurrSamplesPerSec=11.964623445166541, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:25:12,488] [INFO] [timer.py:197:stop] 0/544, RunningAvgSamplesPerSec=11.972836592575407, CurrSamplesPerSec=11.94602108523544, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:25:19,134] [INFO] [timer.py:197:stop] 0/545, RunningAvgSamplesPerSec=11.97287271643062, CurrSamplesPerSec=11.992483975238882, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:25:25,818] [INFO] [timer.py:197:stop] 0/546, RunningAvgSamplesPerSec=11.97293932472891, CurrSamplesPerSec=12.009217422768652, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:25:32,301] [INFO] [timer.py:197:stop] 0/547, RunningAvgSamplesPerSec=11.97274218755734, CurrSamplesPerSec=11.866453369848655, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:25:38,720] [INFO] [timer.py:197:stop] 0/548, RunningAvgSamplesPerSec=11.972600852552905, CurrSamplesPerSec=11.89606657372061, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:25:45,190] [INFO] [timer.py:197:stop] 0/549, RunningAvgSamplesPerSec=11.97249143431459, CurrSamplesPerSec=11.913046249088778, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:25:51,700] [INFO] [logging.py:68:log_dist] [Rank 0] step=550, skipped=4, lr=[9.9e-06], mom=[[0.9, 0.999]] [2022-12-19 19:25:51,701] [INFO] [timer.py:197:stop] 0/550, RunningAvgSamplesPerSec=11.972313631431716, CurrSamplesPerSec=11.875840594711676, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0037, 'learning_rate': 9.9e-06, 'epoch': 14.47} [2022-12-19 19:25:58,203] [INFO] [timer.py:197:stop] 0/551, RunningAvgSamplesPerSec=11.97229690899298, CurrSamplesPerSec=11.963140034258014, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:26:04,842] [INFO] [timer.py:197:stop] 0/552, RunningAvgSamplesPerSec=11.972317530025434, CurrSamplesPerSec=11.983649211525123, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
19:26:11,220] [INFO] [timer.py:197:stop] 0/553, RunningAvgSamplesPerSec=11.97233080516178, CurrSamplesPerSec=11.979636593687708, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:26:17,697] [INFO] [timer.py:197:stop] 0/554, RunningAvgSamplesPerSec=11.972341337218756, CurrSamplesPerSec=11.978147319962417, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:26:24,210] [INFO] [timer.py:197:stop] 0/555, RunningAvgSamplesPerSec=11.972180623461126, CurrSamplesPerSec=11.88412033907062, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:26:30,936] [INFO] [timer.py:197:stop] 0/556, RunningAvgSamplesPerSec=11.97197065456156, CurrSamplesPerSec=11.856975178951966, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:26:37,428] [INFO] [timer.py:197:stop] 0/557, RunningAvgSamplesPerSec=11.971821629939019, CurrSamplesPerSec=11.889828448096438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:26:43,966] [INFO] [timer.py:197:stop] 0/558, RunningAvgSamplesPerSec=11.971596975681212, CurrSamplesPerSec=11.84820133279073, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:26:50,660] [INFO] [timer.py:197:stop] 0/559, RunningAvgSamplesPerSec=11.971477153010794, CurrSamplesPerSec=11.905225104767887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:26:57,411] [INFO] [logging.py:68:log_dist] [Rank 0] step=560, skipped=4, lr=[9.877777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 19:26:57,412] [INFO] [timer.py:197:stop] 0/560, RunningAvgSamplesPerSec=11.971211700768691, CurrSamplesPerSec=11.825161907109676, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:27:03,933] [INFO] [timer.py:197:stop] 0/561, RunningAvgSamplesPerSec=11.970985773174483, CurrSamplesPerSec=11.84623429974481, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:27:10,421] [INFO] [timer.py:197:stop] 0/562, RunningAvgSamplesPerSec=11.971007464587084, CurrSamplesPerSec=11.98314528068618, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
19:27:16,962] [INFO] [timer.py:197:stop] 0/563, RunningAvgSamplesPerSec=11.970878612588608, CurrSamplesPerSec=11.899154597830668, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:27:23,465] [INFO] [timer.py:197:stop] 0/564, RunningAvgSamplesPerSec=11.970581890232928, CurrSamplesPerSec=11.806407701546343, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:27:30,209] [INFO] [timer.py:197:stop] 0/565, RunningAvgSamplesPerSec=11.970576624558733, CurrSamplesPerSec=11.967618048368738, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:27:36,762] [INFO] [timer.py:197:stop] 0/566, RunningAvgSamplesPerSec=11.970390571654809, CurrSamplesPerSec=11.866553038652004, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:27:43,318] [INFO] [timer.py:197:stop] 0/567, RunningAvgSamplesPerSec=11.97020571971344, CurrSamplesPerSec=11.866851005901077, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:27:49,769] [INFO] [timer.py:197:stop] 0/568, RunningAvgSamplesPerSec=11.970223645870083, CurrSamplesPerSec=11.980360516590608, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:27:56,326] [INFO] [timer.py:197:stop] 0/569, RunningAvgSamplesPerSec=11.970059366843243, CurrSamplesPerSec=11.87779539890028, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:28:01,013] [INFO] [logging.py:68:log_dist] [Rank 0] step=570, skipped=4, lr=[9.855555555555555e-06], mom=[[0.9, 0.999]] [2022-12-19 19:28:01,013] [INFO] [timer.py:197:stop] 0/570, RunningAvgSamplesPerSec=11.975816137271956, CurrSamplesPerSec=16.465852896423467, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:28:07,546] [INFO] [timer.py:197:stop] 0/571, RunningAvgSamplesPerSec=11.975599087928371, CurrSamplesPerSec=11.853573477502735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:28:14,093] [INFO] [timer.py:197:stop] 0/572, RunningAvgSamplesPerSec=11.975379716876418, CurrSamplesPerSec=11.851847456317538, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 19:28:20,584] [INFO] [timer.py:197:stop] 0/573, RunningAvgSamplesPerSec=11.975246990556267, CurrSamplesPerSec=11.900068762140933, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:28:27,009] [INFO] [timer.py:197:stop] 0/574, RunningAvgSamplesPerSec=11.975135743276645, CurrSamplesPerSec=11.911949306529012, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:28:33,507] [INFO] [timer.py:197:stop] 0/575, RunningAvgSamplesPerSec=11.975145618361788, CurrSamplesPerSec=11.980796837344585, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0039, 'learning_rate': 9.844444444444446e-06, 'epoch': 15.13} [2022-12-19 19:28:40,027] [INFO] [timer.py:197:stop] 0/576, RunningAvgSamplesPerSec=11.975071602480421, CurrSamplesPerSec=11.932810436469357, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:28:46,584] [INFO] [timer.py:197:stop] 0/577, RunningAvgSamplesPerSec=11.975063936699872, CurrSamplesPerSec=11.970665397694008, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:28:53,082] [INFO] [timer.py:197:stop] 0/578, RunningAvgSamplesPerSec=11.974917879295877, CurrSamplesPerSec=11.891520773346453, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:28:59,545] [INFO] [timer.py:197:stop] 0/579, RunningAvgSamplesPerSec=11.974940407212763, CurrSamplesPerSec=11.98793058797454, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:29:06,083] [INFO] [logging.py:68:log_dist] [Rank 0] step=580, skipped=4, lr=[9.833333333333333e-06], mom=[[0.9, 0.999]] [2022-12-19 19:29:06,084] [INFO] [timer.py:197:stop] 0/580, RunningAvgSamplesPerSec=11.97473211906743, CurrSamplesPerSec=11.85574611066734, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:29:12,822] [INFO] [timer.py:197:stop] 0/581, RunningAvgSamplesPerSec=11.974661744030799, CurrSamplesPerSec=11.93412291758945, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:29:19,202] [INFO] [timer.py:197:stop] 0/582, RunningAvgSamplesPerSec=11.97463069654697, 
CurrSamplesPerSec=11.9566811960055, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:29:25,682] [INFO] [timer.py:197:stop] 0/583, RunningAvgSamplesPerSec=11.974457518761687, CurrSamplesPerSec=11.874851351718402, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:29:32,191] [INFO] [timer.py:197:stop] 0/584, RunningAvgSamplesPerSec=11.97426286311294, CurrSamplesPerSec=11.862227904105406, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:29:38,688] [INFO] [timer.py:197:stop] 0/585, RunningAvgSamplesPerSec=11.97427603984016, CurrSamplesPerSec=11.981949818151952, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:29:45,356] [INFO] [timer.py:197:stop] 0/586, RunningAvgSamplesPerSec=11.974055086764459, CurrSamplesPerSec=11.846612806121012, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:29:51,815] [INFO] [timer.py:197:stop] 0/587, RunningAvgSamplesPerSec=11.974034203852659, CurrSamplesPerSec=11.961851013256428, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:29:58,275] [INFO] [timer.py:197:stop] 0/588, RunningAvgSamplesPerSec=11.973988810012955, CurrSamplesPerSec=11.947492277104441, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:30:04,807] [INFO] [timer.py:197:stop] 0/589, RunningAvgSamplesPerSec=11.97378158739042, CurrSamplesPerSec=11.853570336922818, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:30:11,318] [INFO] [logging.py:68:log_dist] [Rank 0] step=590, skipped=4, lr=[9.811111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 19:30:11,319] [INFO] [timer.py:197:stop] 0/590, RunningAvgSamplesPerSec=11.973594325518631, CurrSamplesPerSec=11.864673252311206, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:30:17,903] [INFO] [timer.py:197:stop] 0/591, RunningAvgSamplesPerSec=11.973427390586687, CurrSamplesPerSec=11.876069147463953, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:30:24,417] [INFO] [timer.py:197:stop] 0/592, RunningAvgSamplesPerSec=11.973246508972519, 
CurrSamplesPerSec=11.86764845665348, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:30:30,952] [INFO] [timer.py:197:stop] 0/593, RunningAvgSamplesPerSec=11.973023018843532, CurrSamplesPerSec=11.842602600237324, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:30:37,391] [INFO] [timer.py:197:stop] 0/594, RunningAvgSamplesPerSec=11.972969564747007, CurrSamplesPerSec=11.941461470296202, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:30:43,892] [INFO] [timer.py:197:stop] 0/595, RunningAvgSamplesPerSec=11.97284001124814, CurrSamplesPerSec=11.896633329617973, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:30:50,450] [INFO] [timer.py:197:stop] 0/596, RunningAvgSamplesPerSec=11.972622151928826, CurrSamplesPerSec=11.844813025596551, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:30:56,957] [INFO] [timer.py:197:stop] 0/597, RunningAvgSamplesPerSec=11.972632986281393, CurrSamplesPerSec=11.97907205870291, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:31:03,457] [INFO] [timer.py:197:stop] 0/598, RunningAvgSamplesPerSec=11.972506449635182, CurrSamplesPerSec=11.897688430237341, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:31:09,944] [INFO] [timer.py:197:stop] 0/599, RunningAvgSamplesPerSec=11.972352314420391, CurrSamplesPerSec=11.88118840628846, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:31:16,655] [INFO] [logging.py:68:log_dist] [Rank 0] step=600, skipped=4, lr=[9.78888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 19:31:16,656] [INFO] [timer.py:197:stop] 0/600, RunningAvgSamplesPerSec=11.972360536868486, CurrSamplesPerSec=11.977271355244234, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0059, 'learning_rate': 9.78888888888889e-06, 'epoch': 15.79} [2022-12-19 19:31:23,371] [INFO] [timer.py:197:stop] 0/601, RunningAvgSamplesPerSec=11.972248379395364, CurrSamplesPerSec=11.905552474841421, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:31:29,944] 
[INFO] [timer.py:197:stop] 0/602, RunningAvgSamplesPerSec=11.972031492714857, CurrSamplesPerSec=11.843513320957232, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:31:36,492] [INFO] [timer.py:197:stop] 0/603, RunningAvgSamplesPerSec=11.971876049655261, CurrSamplesPerSec=11.879332368803489, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:31:42,963] [INFO] [timer.py:197:stop] 0/604, RunningAvgSamplesPerSec=11.97177175707332, CurrSamplesPerSec=11.90941891531148, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:31:49,398] [INFO] [timer.py:197:stop] 0/605, RunningAvgSamplesPerSec=11.97177433075692, CurrSamplesPerSec=11.97332388915701, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:31:55,908] [INFO] [timer.py:197:stop] 0/606, RunningAvgSamplesPerSec=11.971584255924967, CurrSamplesPerSec=11.858057828933267, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:32:02,343] [INFO] [timer.py:197:stop] 0/607, RunningAvgSamplesPerSec=11.971582241077458, CurrSamplesPerSec=11.970365397085155, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:32:06,998] [INFO] [timer.py:197:stop] 0/608, RunningAvgSamplesPerSec=11.977063638514322, CurrSamplesPerSec=16.566010752043834, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:32:13,653] [INFO] [timer.py:197:stop] 0/609, RunningAvgSamplesPerSec=11.976834178469378, CurrSamplesPerSec=11.839379888970635, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:32:20,066] [INFO] [logging.py:68:log_dist] [Rank 0] step=610, skipped=4, lr=[9.766666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 19:32:20,067] [INFO] [timer.py:197:stop] 0/610, RunningAvgSamplesPerSec=11.976799359169414, CurrSamplesPerSec=11.955701336857453, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:32:26,511] [INFO] [timer.py:197:stop] 0/611, RunningAvgSamplesPerSec=11.976824491472861, CurrSamplesPerSec=11.992124484326345, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
19:32:33,168] [INFO] [timer.py:197:stop] 0/612, RunningAvgSamplesPerSec=11.97682546316243, CurrSamplesPerSec=11.977417251395893, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:32:39,748] [INFO] [timer.py:197:stop] 0/613, RunningAvgSamplesPerSec=11.976640271105873, CurrSamplesPerSec=11.864730413371571, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:32:46,182] [INFO] [timer.py:197:stop] 0/614, RunningAvgSamplesPerSec=11.97665746388829, CurrSamplesPerSec=11.987171490947777, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:32:52,669] [INFO] [timer.py:197:stop] 0/615, RunningAvgSamplesPerSec=11.97653013208724, CurrSamplesPerSec=11.89910765377266, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:32:59,164] [INFO] [timer.py:197:stop] 0/616, RunningAvgSamplesPerSec=11.976477808296522, CurrSamplesPerSec=11.944489133952922, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:33:05,687] [INFO] [timer.py:197:stop] 0/617, RunningAvgSamplesPerSec=11.976428793367765, CurrSamplesPerSec=11.946409185237007, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:33:12,125] [INFO] [timer.py:197:stop] 0/618, RunningAvgSamplesPerSec=11.976445188434264, CurrSamplesPerSec=11.986536664156107, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:33:18,631] [INFO] [timer.py:197:stop] 0/619, RunningAvgSamplesPerSec=11.976330313852575, CurrSamplesPerSec=11.905983891610601, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:33:25,153] [INFO] [logging.py:68:log_dist] [Rank 0] step=620, skipped=4, lr=[9.744444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 19:33:25,154] [INFO] [timer.py:197:stop] 0/620, RunningAvgSamplesPerSec=11.976178362705582, CurrSamplesPerSec=11.88315391432687, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:33:31,723] [INFO] [timer.py:197:stop] 0/621, RunningAvgSamplesPerSec=11.975844954305249, CurrSamplesPerSec=11.773289207166249, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
19:33:38,216] [INFO] [timer.py:197:stop] 0/622, RunningAvgSamplesPerSec=11.975712574260081, CurrSamplesPerSec=11.894327102190855, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:33:44,701] [INFO] [timer.py:197:stop] 0/623, RunningAvgSamplesPerSec=11.97567354168919, CurrSamplesPerSec=11.951522230924468, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:33:51,234] [INFO] [timer.py:197:stop] 0/624, RunningAvgSamplesPerSec=11.97548155772061, CurrSamplesPerSec=11.857436603559924, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:33:57,880] [INFO] [timer.py:197:stop] 0/625, RunningAvgSamplesPerSec=11.975491477491861, CurrSamplesPerSec=11.981664760967679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0041, 'learning_rate': 9.733333333333334e-06, 'epoch': 16.45} [2022-12-19 19:34:04,480] [INFO] [timer.py:197:stop] 0/626, RunningAvgSamplesPerSec=11.97549935004787, CurrSamplesPerSec=11.980405965186343, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:34:10,993] [INFO] [timer.py:197:stop] 0/627, RunningAvgSamplesPerSec=11.975348142225789, CurrSamplesPerSec=11.881733234461805, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:34:17,516] [INFO] [timer.py:197:stop] 0/628, RunningAvgSamplesPerSec=11.975277071081818, CurrSamplesPerSec=11.931022022406468, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:34:24,062] [INFO] [timer.py:197:stop] 0/629, RunningAvgSamplesPerSec=11.975084586764885, CurrSamplesPerSec=11.855791665958167, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:34:30,668] [INFO] [logging.py:68:log_dist] [Rank 0] step=630, skipped=4, lr=[9.722222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 19:34:30,668] [INFO] [timer.py:197:stop] 0/630, RunningAvgSamplesPerSec=11.97497757825607, CurrSamplesPerSec=11.908257662965317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:34:37,159] [INFO] [timer.py:197:stop] 0/631, RunningAvgSamplesPerSec=11.974992089318647, 
CurrSamplesPerSec=11.984111987895323, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:34:43,602] [INFO] [timer.py:197:stop] 0/632, RunningAvgSamplesPerSec=11.974877020960001, CurrSamplesPerSec=11.90293454605437, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:34:50,043] [INFO] [timer.py:197:stop] 0/633, RunningAvgSamplesPerSec=11.974881813686583, CurrSamplesPerSec=11.977901994168818, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:34:56,701] [INFO] [timer.py:197:stop] 0/634, RunningAvgSamplesPerSec=11.974884864401915, CurrSamplesPerSec=11.97681017576726, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:35:03,196] [INFO] [timer.py:197:stop] 0/635, RunningAvgSamplesPerSec=11.974699441704077, CurrSamplesPerSec=11.858649782132717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:35:09,741] [INFO] [timer.py:197:stop] 0/636, RunningAvgSamplesPerSec=11.974558917760351, CurrSamplesPerSec=11.886264184973664, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:35:16,187] [INFO] [timer.py:197:stop] 0/637, RunningAvgSamplesPerSec=11.974423312315334, CurrSamplesPerSec=11.889063293902053, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:35:22,702] [INFO] [timer.py:197:stop] 0/638, RunningAvgSamplesPerSec=11.97431101143373, CurrSamplesPerSec=11.903422779760557, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:35:29,205] [INFO] [timer.py:197:stop] 0/639, RunningAvgSamplesPerSec=11.974194950623405, CurrSamplesPerSec=11.900833223163199, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:35:35,711] [INFO] [logging.py:68:log_dist] [Rank 0] step=640, skipped=4, lr=[9.7e-06], mom=[[0.9, 0.999]] [2022-12-19 19:35:35,712] [INFO] [timer.py:197:stop] 0/640, RunningAvgSamplesPerSec=11.974142488288306, CurrSamplesPerSec=11.940817134214983, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:35:42,411] [INFO] [timer.py:197:stop] 0/641, RunningAvgSamplesPerSec=11.97394014772186, 
CurrSamplesPerSec=11.846225935236182, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:35:48,938] [INFO] [timer.py:197:stop] 0/642, RunningAvgSamplesPerSec=11.973830865696149, CurrSamplesPerSec=11.904405174556993, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:35:55,426] [INFO] [timer.py:197:stop] 0/643, RunningAvgSamplesPerSec=11.973669232272032, CurrSamplesPerSec=11.871111265204867, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:36:01,981] [INFO] [timer.py:197:stop] 0/644, RunningAvgSamplesPerSec=11.973488752742794, CurrSamplesPerSec=11.858910155483617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:36:08,431] [INFO] [timer.py:197:stop] 0/645, RunningAvgSamplesPerSec=11.97335393941898, CurrSamplesPerSec=11.887425889430897, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:36:13,241] [INFO] [timer.py:197:stop] 0/646, RunningAvgSamplesPerSec=11.978555711481668, CurrSamplesPerSec=16.621844348814726, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:36:19,695] [INFO] [timer.py:197:stop] 0/647, RunningAvgSamplesPerSec=11.978487173964334, CurrSamplesPerSec=11.934511306077958, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:36:26,202] [INFO] [timer.py:197:stop] 0/648, RunningAvgSamplesPerSec=11.978454024847355, CurrSamplesPerSec=11.957111000118086, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:36:32,697] [INFO] [timer.py:197:stop] 0/649, RunningAvgSamplesPerSec=11.978295197024016, CurrSamplesPerSec=11.876565163467909, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:36:39,167] [INFO] [logging.py:68:log_dist] [Rank 0] step=650, skipped=4, lr=[9.677777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 19:36:39,168] [INFO] [timer.py:197:stop] 0/650, RunningAvgSamplesPerSec=11.978148694954083, CurrSamplesPerSec=11.88410718580074, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0032, 'learning_rate': 9.677777777777778e-06, 'epoch': 17.11} [2022-12-19 19:36:45,649] 
[INFO] [timer.py:197:stop] 0/651, RunningAvgSamplesPerSec=11.978147492909812, CurrSamplesPerSec=11.977368618950415, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:36:52,128] [INFO] [timer.py:197:stop] 0/652, RunningAvgSamplesPerSec=11.978188725796924, CurrSamplesPerSec=12.005008880028736, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:36:58,626] [INFO] [timer.py:197:stop] 0/653, RunningAvgSamplesPerSec=11.978072758688917, CurrSamplesPerSec=11.903166254138675, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:37:05,100] [INFO] [timer.py:197:stop] 0/654, RunningAvgSamplesPerSec=11.977948028263256, CurrSamplesPerSec=11.897296107101942, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:37:11,617] [INFO] [timer.py:197:stop] 0/655, RunningAvgSamplesPerSec=11.977822727291652, CurrSamplesPerSec=11.896680781398778, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:37:18,020] [INFO] [timer.py:197:stop] 0/656, RunningAvgSamplesPerSec=11.977869265862653, CurrSamplesPerSec=12.00833637080853, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:37:24,467] [INFO] [timer.py:197:stop] 0/657, RunningAvgSamplesPerSec=11.97787510355765, CurrSamplesPerSec=11.981694175246536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:37:31,217] [INFO] [timer.py:197:stop] 0/658, RunningAvgSamplesPerSec=11.977749953015193, CurrSamplesPerSec=11.896334392103153, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:37:37,768] [INFO] [timer.py:197:stop] 0/659, RunningAvgSamplesPerSec=11.97761777706295, CurrSamplesPerSec=11.891534469820135, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:37:44,274] [INFO] [logging.py:68:log_dist] [Rank 0] step=660, skipped=4, lr=[9.655555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 19:37:44,278] [INFO] [timer.py:197:stop] 0/660, RunningAvgSamplesPerSec=11.977488820867785, CurrSamplesPerSec=11.893360597609396, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
19:37:50,807] [INFO] [timer.py:197:stop] 0/661, RunningAvgSamplesPerSec=11.977439445873683, CurrSamplesPerSec=11.94503872013089, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:37:57,301] [INFO] [timer.py:197:stop] 0/662, RunningAvgSamplesPerSec=11.977299796536682, CurrSamplesPerSec=11.885973662762156, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:38:03,862] [INFO] [timer.py:197:stop] 0/663, RunningAvgSamplesPerSec=11.97703058783773, CurrSamplesPerSec=11.801954020577972, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:38:10,570] [INFO] [timer.py:197:stop] 0/664, RunningAvgSamplesPerSec=11.97678806239548, CurrSamplesPerSec=11.818599304969712, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:38:17,243] [INFO] [timer.py:197:stop] 0/665, RunningAvgSamplesPerSec=11.976631586659392, CurrSamplesPerSec=11.873934229570153, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:38:23,745] [INFO] [timer.py:197:stop] 0/666, RunningAvgSamplesPerSec=11.976645785868248, CurrSamplesPerSec=11.986067278143265, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:38:30,267] [INFO] [timer.py:197:stop] 0/667, RunningAvgSamplesPerSec=11.976513259571458, CurrSamplesPerSec=11.889158603679634, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:38:36,738] [INFO] [timer.py:197:stop] 0/668, RunningAvgSamplesPerSec=11.976489028525965, CurrSamplesPerSec=11.960397066596578, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:38:43,250] [INFO] [timer.py:197:stop] 0/669, RunningAvgSamplesPerSec=11.976383125636769, CurrSamplesPerSec=11.906265359060658, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:38:49,874] [INFO] [logging.py:68:log_dist] [Rank 0] step=670, skipped=4, lr=[9.633333333333335e-06], mom=[[0.9, 0.999]] [2022-12-19 19:38:49,874] [INFO] [timer.py:197:stop] 0/670, RunningAvgSamplesPerSec=11.976178151244438, CurrSamplesPerSec=11.841005650289713, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 19:38:56,350] [INFO] [timer.py:197:stop] 0/671, RunningAvgSamplesPerSec=11.97620547624723, CurrSamplesPerSec=11.99448648215828, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:39:02,865] [INFO] [timer.py:197:stop] 0/672, RunningAvgSamplesPerSec=11.97616119805436, CurrSamplesPerSec=11.946612283233677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:39:09,326] [INFO] [timer.py:197:stop] 0/673, RunningAvgSamplesPerSec=11.976073455605626, CurrSamplesPerSec=11.917573604160975, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:39:16,047] [INFO] [timer.py:197:stop] 0/674, RunningAvgSamplesPerSec=11.975934733379361, CurrSamplesPerSec=11.883571083298984, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:39:22,625] [INFO] [timer.py:197:stop] 0/675, RunningAvgSamplesPerSec=11.975947719467163, CurrSamplesPerSec=11.984680743532465, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0034, 'learning_rate': 9.622222222222222e-06, 'epoch': 17.76} [2022-12-19 19:39:29,132] [INFO] [timer.py:197:stop] 0/676, RunningAvgSamplesPerSec=11.975912409494633, CurrSamplesPerSec=11.952195928143308, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:39:35,609] [INFO] [timer.py:197:stop] 0/677, RunningAvgSamplesPerSec=11.975803769199521, CurrSamplesPerSec=11.903025855908071, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:39:42,091] [INFO] [timer.py:197:stop] 0/678, RunningAvgSamplesPerSec=11.97567938308128, CurrSamplesPerSec=11.892304156985325, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:39:48,519] [INFO] [timer.py:197:stop] 0/679, RunningAvgSamplesPerSec=11.975549453856276, CurrSamplesPerSec=11.888357733134713, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:39:55,123] [INFO] [logging.py:68:log_dist] [Rank 0] step=680, skipped=4, lr=[9.611111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 19:39:55,123] [INFO] [timer.py:197:stop] 0/680, RunningAvgSamplesPerSec=11.975564106186184, 
CurrSamplesPerSec=11.985491969151347, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:40:01,609] [INFO] [timer.py:197:stop] 0/681, RunningAvgSamplesPerSec=11.975601562748755, CurrSamplesPerSec=12.001051160376083, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:40:08,147] [INFO] [timer.py:197:stop] 0/682, RunningAvgSamplesPerSec=11.975419587003449, CurrSamplesPerSec=11.853121774518845, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:40:14,671] [INFO] [timer.py:197:stop] 0/683, RunningAvgSamplesPerSec=11.975346195331841, CurrSamplesPerSec=11.925647279638905, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:40:19,361] [INFO] [timer.py:197:stop] 0/684, RunningAvgSamplesPerSec=11.980158171428565, CurrSamplesPerSec=16.49345841801192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:40:26,077] [INFO] [timer.py:197:stop] 0/685, RunningAvgSamplesPerSec=11.98004187176945, CurrSamplesPerSec=11.901247940832896, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:40:32,551] [INFO] [timer.py:197:stop] 0/686, RunningAvgSamplesPerSec=11.980037465495455, CurrSamplesPerSec=11.977028737279714, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:40:39,074] [INFO] [timer.py:197:stop] 0/687, RunningAvgSamplesPerSec=11.979800598670858, CurrSamplesPerSec=11.819948715948051, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:40:45,552] [INFO] [timer.py:197:stop] 0/688, RunningAvgSamplesPerSec=11.97975829874634, CurrSamplesPerSec=11.950852866100679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:40:52,099] [INFO] [timer.py:197:stop] 0/689, RunningAvgSamplesPerSec=11.979494167064313, CurrSamplesPerSec=11.801003506729925, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:40:58,634] [INFO] [logging.py:68:log_dist] [Rank 0] step=690, skipped=4, lr=[9.58888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 19:40:58,635] [INFO] [timer.py:197:stop] 0/690, 
RunningAvgSamplesPerSec=11.979355380969116, CurrSamplesPerSec=11.884763306398538, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:41:05,200] [INFO] [timer.py:197:stop] 0/691, RunningAvgSamplesPerSec=11.97922316380156, CurrSamplesPerSec=11.888944290468602, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:41:11,745] [INFO] [timer.py:197:stop] 0/692, RunningAvgSamplesPerSec=11.97903085416433, CurrSamplesPerSec=11.847981172495958, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:41:18,237] [INFO] [timer.py:197:stop] 0/693, RunningAvgSamplesPerSec=11.97900966105328, CurrSamplesPerSec=11.964404269640626, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:41:24,948] [INFO] [timer.py:197:stop] 0/694, RunningAvgSamplesPerSec=11.979045972093477, CurrSamplesPerSec=12.004189642217757, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:41:31,359] [INFO] [timer.py:197:stop] 0/695, RunningAvgSamplesPerSec=11.979076515016768, CurrSamplesPerSec=12.000249629445117, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:41:37,933] [INFO] [timer.py:197:stop] 0/696, RunningAvgSamplesPerSec=11.978872341496341, CurrSamplesPerSec=11.839034217655005, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:41:44,391] [INFO] [timer.py:197:stop] 0/697, RunningAvgSamplesPerSec=11.978867397144608, CurrSamplesPerSec=11.975437001104236, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:41:50,924] [INFO] [timer.py:197:stop] 0/698, RunningAvgSamplesPerSec=11.978752303121045, CurrSamplesPerSec=11.89929332232395, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:41:57,393] [INFO] [timer.py:197:stop] 0/699, RunningAvgSamplesPerSec=11.978624188200971, CurrSamplesPerSec=11.890115998806362, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:42:03,949] [INFO] [logging.py:68:log_dist] [Rank 0] step=700, skipped=4, lr=[9.566666666666668e-06], mom=[[0.9, 0.999]] [2022-12-19 19:42:03,950] [INFO] [timer.py:197:stop] 
0/700, RunningAvgSamplesPerSec=11.978583673268098, CurrSamplesPerSec=11.95041127535315, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0022, 'learning_rate': 9.566666666666668e-06, 'epoch': 18.42} [2022-12-19 19:42:10,724] [INFO] [timer.py:197:stop] 0/701, RunningAvgSamplesPerSec=11.978405409161088, CurrSamplesPerSec=11.855258115031516, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:42:17,217] [INFO] [timer.py:197:stop] 0/702, RunningAvgSamplesPerSec=11.978369458806464, CurrSamplesPerSec=11.95329284415615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:42:23,646] [INFO] [timer.py:197:stop] 0/703, RunningAvgSamplesPerSec=11.978377741346549, CurrSamplesPerSec=11.984178331020894, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:42:30,164] [INFO] [timer.py:197:stop] 0/704, RunningAvgSamplesPerSec=11.978239482854361, CurrSamplesPerSec=11.882099286083735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:42:36,671] [INFO] [timer.py:197:stop] 0/705, RunningAvgSamplesPerSec=11.978141989512459, CurrSamplesPerSec=11.910091044701355, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:42:43,316] [INFO] [timer.py:197:stop] 0/706, RunningAvgSamplesPerSec=11.977669103547663, CurrSamplesPerSec=11.654220324064358, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:42:49,834] [INFO] [timer.py:197:stop] 0/707, RunningAvgSamplesPerSec=11.97770315266527, CurrSamplesPerSec=12.001721867593247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:42:56,372] [INFO] [timer.py:197:stop] 0/708, RunningAvgSamplesPerSec=11.977700559176439, CurrSamplesPerSec=11.975872429012957, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:43:02,769] [INFO] [timer.py:197:stop] 0/709, RunningAvgSamplesPerSec=11.977715028571701, CurrSamplesPerSec=11.987939153792759, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:43:09,263] [INFO] [logging.py:68:log_dist] [Rank 0] step=710, skipped=4, 
lr=[9.544444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 19:43:09,264] [INFO] [timer.py:197:stop] 0/710, RunningAvgSamplesPerSec=11.977595669608714, CurrSamplesPerSec=11.893800090033567, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:43:15,785] [INFO] [timer.py:197:stop] 0/711, RunningAvgSamplesPerSec=11.977470951137823, CurrSamplesPerSec=11.889817388731501, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:43:22,338] [INFO] [timer.py:197:stop] 0/712, RunningAvgSamplesPerSec=11.977219984958623, CurrSamplesPerSec=11.801893311786731, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:43:28,833] [INFO] [timer.py:197:stop] 0/713, RunningAvgSamplesPerSec=11.977126634415487, CurrSamplesPerSec=11.911213014556118, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:43:35,580] [INFO] [timer.py:197:stop] 0/714, RunningAvgSamplesPerSec=11.976991331599185, CurrSamplesPerSec=11.881558631575729, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:43:42,135] [INFO] [timer.py:197:stop] 0/715, RunningAvgSamplesPerSec=11.976701065505281, CurrSamplesPerSec=11.773542230841562, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:43:48,624] [INFO] [timer.py:197:stop] 0/716, RunningAvgSamplesPerSec=11.976554169883514, CurrSamplesPerSec=11.872726848929643, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:43:55,047] [INFO] [timer.py:197:stop] 0/717, RunningAvgSamplesPerSec=11.976580911306579, CurrSamplesPerSec=11.995704817904988, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:44:01,619] [INFO] [timer.py:197:stop] 0/718, RunningAvgSamplesPerSec=11.97644817832658, CurrSamplesPerSec=11.882291261826497, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:44:08,112] [INFO] [timer.py:197:stop] 0/719, RunningAvgSamplesPerSec=11.976296629007416, CurrSamplesPerSec=11.86876296869121, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:44:14,573] [INFO] [logging.py:68:log_dist] [Rank 0] step=720, 
skipped=4, lr=[9.522222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 19:44:14,574] [INFO] [timer.py:197:stop] 0/720, RunningAvgSamplesPerSec=11.976276684199156, CurrSamplesPerSec=11.961993335679072, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:44:21,075] [INFO] [timer.py:197:stop] 0/721, RunningAvgSamplesPerSec=11.976120602571164, CurrSamplesPerSec=11.86509436918722, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:44:26,010] [INFO] [timer.py:197:stop] 0/722, RunningAvgSamplesPerSec=11.98068217975925, CurrSamplesPerSec=16.49912327716341, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:44:32,414] [INFO] [timer.py:197:stop] 0/723, RunningAvgSamplesPerSec=11.98066078627941, CurrSamplesPerSec=11.965277286533706, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:44:38,937] [INFO] [timer.py:197:stop] 0/724, RunningAvgSamplesPerSec=11.980512184572245, CurrSamplesPerSec=11.874321336588874, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:44:45,430] [INFO] [timer.py:197:stop] 0/725, RunningAvgSamplesPerSec=11.980532445810129, CurrSamplesPerSec=11.995178968204105, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0024, 'learning_rate': 9.511111111111112e-06, 'epoch': 19.08} [2022-12-19 19:44:51,933] [INFO] [timer.py:197:stop] 0/726, RunningAvgSamplesPerSec=11.980425548659776, CurrSamplesPerSec=11.903634975976148, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:44:58,458] [INFO] [timer.py:197:stop] 0/727, RunningAvgSamplesPerSec=11.980285437372041, CurrSamplesPerSec=11.879697746766794, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:45:04,900] [INFO] [timer.py:197:stop] 0/728, RunningAvgSamplesPerSec=11.980213817416447, CurrSamplesPerSec=11.92851373642, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:45:11,370] [INFO] [timer.py:197:stop] 0/729, RunningAvgSamplesPerSec=11.980179835447856, CurrSamplesPerSec=11.955559696593365, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 19:45:18,011] [INFO] [logging.py:68:log_dist] [Rank 0] step=730, skipped=4, lr=[9.5e-06], mom=[[0.9, 0.999]] [2022-12-19 19:45:18,012] [INFO] [timer.py:197:stop] 0/730, RunningAvgSamplesPerSec=11.98003403203525, CurrSamplesPerSec=11.87496587061819, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:45:24,467] [INFO] [timer.py:197:stop] 0/731, RunningAvgSamplesPerSec=11.979884701901927, CurrSamplesPerSec=11.872151341975183, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:45:31,022] [INFO] [timer.py:197:stop] 0/732, RunningAvgSamplesPerSec=11.97975129332956, CurrSamplesPerSec=11.883280692719694, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:45:37,662] [INFO] [timer.py:197:stop] 0/733, RunningAvgSamplesPerSec=11.979655168016306, CurrSamplesPerSec=11.909892885462195, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:45:44,417] [INFO] [timer.py:197:stop] 0/734, RunningAvgSamplesPerSec=11.979438379752066, CurrSamplesPerSec=11.823037960856995, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:45:50,894] [INFO] [timer.py:197:stop] 0/735, RunningAvgSamplesPerSec=11.979412094280493, CurrSamplesPerSec=11.960202025863683, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:45:57,373] [INFO] [timer.py:197:stop] 0/736, RunningAvgSamplesPerSec=11.979322075534418, CurrSamplesPerSec=11.913700281572673, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:46:04,088] [INFO] [timer.py:197:stop] 0/737, RunningAvgSamplesPerSec=11.979229066472591, CurrSamplesPerSec=11.911347791994812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:46:10,583] [INFO] [timer.py:197:stop] 0/738, RunningAvgSamplesPerSec=11.979134266249346, CurrSamplesPerSec=11.909859595356957, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:46:17,040] [INFO] [timer.py:197:stop] 0/739, RunningAvgSamplesPerSec=11.979153922087859, CurrSamplesPerSec=11.993638134946083, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
19:46:23,606] [INFO] [logging.py:68:log_dist] [Rank 0] step=740, skipped=4, lr=[9.47777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 19:46:23,607] [INFO] [timer.py:197:stop] 0/740, RunningAvgSamplesPerSec=11.979027125097204, CurrSamplesPerSec=11.886302080222208, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:46:30,187] [INFO] [timer.py:197:stop] 0/741, RunningAvgSamplesPerSec=11.978834902806833, CurrSamplesPerSec=11.83863739934022, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:46:36,862] [INFO] [timer.py:197:stop] 0/742, RunningAvgSamplesPerSec=11.978711314874198, CurrSamplesPerSec=11.888071846905047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:46:43,328] [INFO] [timer.py:197:stop] 0/743, RunningAvgSamplesPerSec=11.978689803349484, CurrSamplesPerSec=11.962792429693451, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:46:49,886] [INFO] [timer.py:197:stop] 0/744, RunningAvgSamplesPerSec=11.978578230047376, CurrSamplesPerSec=11.896469887492607, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:46:56,309] [INFO] [timer.py:197:stop] 0/745, RunningAvgSamplesPerSec=11.978473389916921, CurrSamplesPerSec=11.901184623288588, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:47:02,810] [INFO] [timer.py:197:stop] 0/746, RunningAvgSamplesPerSec=11.978375544989852, CurrSamplesPerSec=11.906115910333419, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:47:09,321] [INFO] [timer.py:197:stop] 0/747, RunningAvgSamplesPerSec=11.978286545053317, CurrSamplesPerSec=11.912435108214439, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:47:15,812] [INFO] [timer.py:197:stop] 0/748, RunningAvgSamplesPerSec=11.978170292320765, CurrSamplesPerSec=11.89218456181892, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:47:22,366] [INFO] [timer.py:197:stop] 0/749, RunningAvgSamplesPerSec=11.978200150970373, CurrSamplesPerSec=12.00051625803764, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 19:47:28,871] [INFO] [logging.py:68:log_dist] [Rank 0] step=750, skipped=4, lr=[9.455555555555557e-06], mom=[[0.9, 0.999]] [2022-12-19 19:47:28,872] [INFO] [timer.py:197:stop] 0/750, RunningAvgSamplesPerSec=11.97810552129097, CurrSamplesPerSec=11.90783242019479, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0022, 'learning_rate': 9.455555555555557e-06, 'epoch': 19.74} [2022-12-19 19:47:35,330] [INFO] [timer.py:197:stop] 0/751, RunningAvgSamplesPerSec=11.978108337300501, CurrSamplesPerSec=11.980215083401454, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:47:41,783] [INFO] [timer.py:197:stop] 0/752, RunningAvgSamplesPerSec=11.978105564339486, CurrSamplesPerSec=11.97602897709133, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:47:48,232] [INFO] [timer.py:197:stop] 0/753, RunningAvgSamplesPerSec=11.978020223764426, CurrSamplesPerSec=11.914355443497685, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:47:54,804] [INFO] [timer.py:197:stop] 0/754, RunningAvgSamplesPerSec=11.977900194604274, CurrSamplesPerSec=11.88843249754756, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:48:01,298] [INFO] [timer.py:197:stop] 0/755, RunningAvgSamplesPerSec=11.977771031799467, CurrSamplesPerSec=11.88142294974375, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:48:07,778] [INFO] [timer.py:197:stop] 0/756, RunningAvgSamplesPerSec=11.977786016464945, CurrSamplesPerSec=11.98908012310498, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:48:14,295] [INFO] [timer.py:197:stop] 0/757, RunningAvgSamplesPerSec=11.97776445344107, CurrSamplesPerSec=11.961528001896838, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:48:20,738] [INFO] [timer.py:197:stop] 0/758, RunningAvgSamplesPerSec=11.97779573285277, CurrSamplesPerSec=12.001458404858285, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:48:27,216] [INFO] [timer.py:197:stop] 0/759, RunningAvgSamplesPerSec=11.977744124173299, 
CurrSamplesPerSec=11.938854807654161, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:48:31,980] [INFO] [logging.py:68:log_dist] [Rank 0] step=760, skipped=4, lr=[9.433333333333335e-06], mom=[[0.9, 0.999]] [2022-12-19 19:48:31,981] [INFO] [timer.py:197:stop] 0/760, RunningAvgSamplesPerSec=11.982174937497238, CurrSamplesPerSec=16.642599215756004, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:48:38,517] [INFO] [timer.py:197:stop] 0/761, RunningAvgSamplesPerSec=11.982178203031259, CurrSamplesPerSec=11.984653989940739, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:48:45,058] [INFO] [timer.py:197:stop] 0/762, RunningAvgSamplesPerSec=11.982064794163506, CurrSamplesPerSec=11.896602222544859, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:48:51,507] [INFO] [timer.py:197:stop] 0/763, RunningAvgSamplesPerSec=11.982095367257427, CurrSamplesPerSec=12.00537612383475, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:48:58,021] [INFO] [timer.py:197:stop] 0/764, RunningAvgSamplesPerSec=11.981912320565547, CurrSamplesPerSec=11.844216704299297, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:49:04,524] [INFO] [timer.py:197:stop] 0/765, RunningAvgSamplesPerSec=11.98190787030546, CurrSamplesPerSec=11.978517732848205, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:49:11,029] [INFO] [timer.py:197:stop] 0/766, RunningAvgSamplesPerSec=11.981716785559062, CurrSamplesPerSec=11.837674181332742, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:49:17,591] [INFO] [timer.py:197:stop] 0/767, RunningAvgSamplesPerSec=11.981589066790168, CurrSamplesPerSec=11.884801192077603, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:49:24,343] [INFO] [timer.py:197:stop] 0/768, RunningAvgSamplesPerSec=11.981388311505341, CurrSamplesPerSec=11.829756678093096, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:49:30,775] [INFO] [timer.py:197:stop] 0/769, 
RunningAvgSamplesPerSec=11.981275278213968, CurrSamplesPerSec=11.895313794508033, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:49:37,246] [INFO] [logging.py:68:log_dist] [Rank 0] step=770, skipped=4, lr=[9.411111111111113e-06], mom=[[0.9, 0.999]] [2022-12-19 19:49:37,247] [INFO] [timer.py:197:stop] 0/770, RunningAvgSamplesPerSec=11.981253213669396, CurrSamplesPerSec=11.964353609773532, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:49:43,830] [INFO] [timer.py:197:stop] 0/771, RunningAvgSamplesPerSec=11.981239772573442, CurrSamplesPerSec=11.970925908641167, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:49:50,414] [INFO] [timer.py:197:stop] 0/772, RunningAvgSamplesPerSec=11.981108188076112, CurrSamplesPerSec=11.880768249574357, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:49:56,978] [INFO] [timer.py:197:stop] 0/773, RunningAvgSamplesPerSec=11.980976876354708, CurrSamplesPerSec=11.880714088856092, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:50:03,468] [INFO] [timer.py:197:stop] 0/774, RunningAvgSamplesPerSec=11.980986177769637, CurrSamplesPerSec=11.988161869363177, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:50:10,011] [INFO] [timer.py:197:stop] 0/775, RunningAvgSamplesPerSec=11.980766366325167, CurrSamplesPerSec=11.81344492601003, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0016, 'learning_rate': 9.4e-06, 'epoch': 20.39} [2022-12-19 19:50:16,487] [INFO] [timer.py:197:stop] 0/776, RunningAvgSamplesPerSec=11.980602403247165, CurrSamplesPerSec=11.855187432299326, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:50:22,950] [INFO] [timer.py:197:stop] 0/777, RunningAvgSamplesPerSec=11.980626320045564, CurrSamplesPerSec=11.999166606134903, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:50:29,645] [INFO] [timer.py:197:stop] 0/778, RunningAvgSamplesPerSec=11.980489992252004, CurrSamplesPerSec=11.875760734674495, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 19:50:36,278] [INFO] [timer.py:197:stop] 0/779, RunningAvgSamplesPerSec=11.980432586270492, CurrSamplesPerSec=11.936050782877498, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:50:42,822] [INFO] [logging.py:68:log_dist] [Rank 0] step=780, skipped=4, lr=[9.38888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 19:50:42,823] [INFO] [timer.py:197:stop] 0/780, RunningAvgSamplesPerSec=11.980286357016164, CurrSamplesPerSec=11.867735028479908, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:50:49,285] [INFO] [timer.py:197:stop] 0/781, RunningAvgSamplesPerSec=11.980295375691854, CurrSamplesPerSec=11.98731602246405, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:50:55,743] [INFO] [timer.py:197:stop] 0/782, RunningAvgSamplesPerSec=11.9803104686943, CurrSamplesPerSec=11.992079482510942, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:51:02,279] [INFO] [timer.py:197:stop] 0/783, RunningAvgSamplesPerSec=11.980189983197402, CurrSamplesPerSec=11.886943704511545, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:51:08,893] [INFO] [timer.py:197:stop] 0/784, RunningAvgSamplesPerSec=11.979890129862069, CurrSamplesPerSec=11.750200447087725, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:51:15,371] [INFO] [timer.py:197:stop] 0/785, RunningAvgSamplesPerSec=11.97978688191083, CurrSamplesPerSec=11.899588188550048, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:51:21,841] [INFO] [timer.py:197:stop] 0/786, RunningAvgSamplesPerSec=11.979679000875512, CurrSamplesPerSec=11.895800348894433, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:51:28,292] [INFO] [timer.py:197:stop] 0/787, RunningAvgSamplesPerSec=11.979686072713772, CurrSamplesPerSec=11.98523296434199, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:51:34,773] [INFO] [timer.py:197:stop] 0/788, RunningAvgSamplesPerSec=11.979659260781558, CurrSamplesPerSec=11.958648854731944, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 19:51:41,269] [INFO] [timer.py:197:stop] 0/789, RunningAvgSamplesPerSec=11.979559448873635, CurrSamplesPerSec=11.9016183619666, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:51:47,891] [INFO] [logging.py:68:log_dist] [Rank 0] step=790, skipped=4, lr=[9.366666666666668e-06], mom=[[0.9, 0.999]] [2022-12-19 19:51:47,891] [INFO] [timer.py:197:stop] 0/790, RunningAvgSamplesPerSec=11.979561312769011, CurrSamplesPerSec=11.981028378298582, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:51:54,457] [INFO] [timer.py:197:stop] 0/791, RunningAvgSamplesPerSec=11.979412561144684, CurrSamplesPerSec=11.863333533828419, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:52:00,978] [INFO] [timer.py:197:stop] 0/792, RunningAvgSamplesPerSec=11.97931202287896, CurrSamplesPerSec=11.900509805651419, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:52:07,441] [INFO] [timer.py:197:stop] 0/793, RunningAvgSamplesPerSec=11.979318777362575, CurrSamplesPerSec=11.984657200365438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:52:13,892] [INFO] [timer.py:197:stop] 0/794, RunningAvgSamplesPerSec=11.97935416159502, CurrSamplesPerSec=12.007408719850522, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:52:20,391] [INFO] [timer.py:197:stop] 0/795, RunningAvgSamplesPerSec=11.979246869102509, CurrSamplesPerSec=11.894870500625041, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:52:26,885] [INFO] [timer.py:197:stop] 0/796, RunningAvgSamplesPerSec=11.979242591050609, CurrSamplesPerSec=11.975851057579147, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:52:33,329] [INFO] [timer.py:197:stop] 0/797, RunningAvgSamplesPerSec=11.979127747950312, CurrSamplesPerSec=11.888632049422895, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:52:37,960] [INFO] [timer.py:197:stop] 0/798, RunningAvgSamplesPerSec=11.983327798767473, CurrSamplesPerSec=16.614393901366984, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:52:44,505] [INFO] [timer.py:197:stop] 0/799, RunningAvgSamplesPerSec=11.983242355228949, CurrSamplesPerSec=11.915613619878123, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:52:50,968] [INFO] [logging.py:68:log_dist] [Rank 0] step=800, skipped=4, lr=[9.344444444444446e-06], mom=[[0.9, 0.999]] [2022-12-19 19:52:50,969] [INFO] [timer.py:197:stop] 0/800, RunningAvgSamplesPerSec=11.98314432544352, CurrSamplesPerSec=11.905521321107688, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0016, 'learning_rate': 9.344444444444446e-06, 'epoch': 21.05} [2022-12-19 19:52:57,443] [INFO] [timer.py:197:stop] 0/801, RunningAvgSamplesPerSec=11.983141642587343, CurrSamplesPerSec=11.98100110626628, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:53:03,923] [INFO] [timer.py:197:stop] 0/802, RunningAvgSamplesPerSec=11.983021135378596, CurrSamplesPerSec=11.88750432713988, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:53:10,487] [INFO] [timer.py:197:stop] 0/803, RunningAvgSamplesPerSec=11.982819024731294, CurrSamplesPerSec=11.82328583678163, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:53:16,928] [INFO] [timer.py:197:stop] 0/804, RunningAvgSamplesPerSec=11.982790971478826, CurrSamplesPerSec=11.960362427750507, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:53:23,448] [INFO] [timer.py:197:stop] 0/805, RunningAvgSamplesPerSec=11.982713734166708, CurrSamplesPerSec=11.921088377732412, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:53:29,893] [INFO] [timer.py:197:stop] 0/806, RunningAvgSamplesPerSec=11.982723105360106, CurrSamplesPerSec=11.990252908212623, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:53:36,399] [INFO] [timer.py:197:stop] 0/807, RunningAvgSamplesPerSec=11.982723619816003, CurrSamplesPerSec=11.983137256652576, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:53:42,839] [INFO] [timer.py:197:stop] 0/808, 
RunningAvgSamplesPerSec=11.982617439646585, CurrSamplesPerSec=11.897748546552213, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:53:49,349] [INFO] [timer.py:197:stop] 0/809, RunningAvgSamplesPerSec=11.982509903291962, CurrSamplesPerSec=11.89645881576911, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:53:55,848] [INFO] [logging.py:68:log_dist] [Rank 0] step=810, skipped=4, lr=[9.322222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 19:53:55,848] [INFO] [timer.py:197:stop] 0/810, RunningAvgSamplesPerSec=11.982403897411295, CurrSamplesPerSec=11.897464317750822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:54:02,346] [INFO] [timer.py:197:stop] 0/811, RunningAvgSamplesPerSec=11.98241782785198, CurrSamplesPerSec=11.993684220214812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:54:08,790] [INFO] [timer.py:197:stop] 0/812, RunningAvgSamplesPerSec=11.982416056155053, CurrSamplesPerSec=11.980982924980388, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:54:15,252] [INFO] [timer.py:197:stop] 0/813, RunningAvgSamplesPerSec=11.982407782667156, CurrSamplesPerSec=11.975710008029, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:54:21,817] [INFO] [timer.py:197:stop] 0/814, RunningAvgSamplesPerSec=11.98223398583903, CurrSamplesPerSec=11.842925490352137, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:54:28,304] [INFO] [timer.py:197:stop] 0/815, RunningAvgSamplesPerSec=11.982213867109767, CurrSamplesPerSec=11.965899728841453, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:54:34,837] [INFO] [timer.py:197:stop] 0/816, RunningAvgSamplesPerSec=11.982188269993504, CurrSamplesPerSec=11.961413939346976, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:54:41,386] [INFO] [timer.py:197:stop] 0/817, RunningAvgSamplesPerSec=11.982076509046031, CurrSamplesPerSec=11.891789440706084, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:54:47,951] [INFO] [timer.py:197:stop] 0/818, 
RunningAvgSamplesPerSec=11.982062734376463, CurrSamplesPerSec=11.970846900024835, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:54:54,436] [INFO] [timer.py:197:stop] 0/819, RunningAvgSamplesPerSec=11.9820604273514, CurrSamplesPerSec=11.980178190984786, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:55:00,920] [INFO] [logging.py:68:log_dist] [Rank 0] step=820, skipped=4, lr=[9.3e-06], mom=[[0.9, 0.999]] [2022-12-19 19:55:00,921] [INFO] [timer.py:197:stop] 0/820, RunningAvgSamplesPerSec=11.98193607279563, CurrSamplesPerSec=11.881193665008846, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:55:07,507] [INFO] [timer.py:197:stop] 0/821, RunningAvgSamplesPerSec=11.981755749473937, CurrSamplesPerSec=11.836047247104597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:55:14,003] [INFO] [timer.py:197:stop] 0/822, RunningAvgSamplesPerSec=11.981649637780032, CurrSamplesPerSec=11.89537072405942, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:55:20,543] [INFO] [timer.py:197:stop] 0/823, RunningAvgSamplesPerSec=11.981526332425174, CurrSamplesPerSec=11.881263080554193, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:55:27,044] [INFO] [timer.py:197:stop] 0/824, RunningAvgSamplesPerSec=11.98140995646887, CurrSamplesPerSec=11.886622094178941, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:55:33,500] [INFO] [timer.py:197:stop] 0/825, RunningAvgSamplesPerSec=11.98141219335818, CurrSamplesPerSec=11.9832511989369, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0025, 'learning_rate': 9.28888888888889e-06, 'epoch': 21.71} [2022-12-19 19:55:39,977] [INFO] [timer.py:197:stop] 0/826, RunningAvgSamplesPerSec=11.981307253376455, CurrSamplesPerSec=11.895560493166274, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:55:46,499] [INFO] [timer.py:197:stop] 0/827, RunningAvgSamplesPerSec=11.981128459132593, CurrSamplesPerSec=11.835593747787183, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 19:55:53,075] [INFO] [timer.py:197:stop] 0/828, RunningAvgSamplesPerSec=11.98085174090832, CurrSamplesPerSec=11.75683300840612, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:55:59,551] [INFO] [timer.py:197:stop] 0/829, RunningAvgSamplesPerSec=11.980852009453596, CurrSamplesPerSec=11.981073831961657, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:56:06,147] [INFO] [logging.py:68:log_dist] [Rank 0] step=830, skipped=4, lr=[9.277777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 19:56:06,147] [INFO] [timer.py:197:stop] 0/830, RunningAvgSamplesPerSec=11.980727265209744, CurrSamplesPerSec=11.878445566424796, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:56:12,645] [INFO] [timer.py:197:stop] 0/831, RunningAvgSamplesPerSec=11.980627212830624, CurrSamplesPerSec=11.898353434861107, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:56:19,116] [INFO] [timer.py:197:stop] 0/832, RunningAvgSamplesPerSec=11.980636256685223, CurrSamplesPerSec=11.988138312517263, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:56:25,638] [INFO] [timer.py:197:stop] 0/833, RunningAvgSamplesPerSec=11.980506246720328, CurrSamplesPerSec=11.873562378482356, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:56:32,059] [INFO] [timer.py:197:stop] 0/834, RunningAvgSamplesPerSec=11.98053467038546, CurrSamplesPerSec=12.004201452191309, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:56:38,562] [INFO] [timer.py:197:stop] 0/835, RunningAvgSamplesPerSec=11.980536521172313, CurrSamplesPerSec=11.982076574014389, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:56:43,237] [INFO] [timer.py:197:stop] 0/836, RunningAvgSamplesPerSec=11.984386151369149, CurrSamplesPerSec=16.36457006948476, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:56:49,764] [INFO] [timer.py:197:stop] 0/837, RunningAvgSamplesPerSec=11.984250901143252, CurrSamplesPerSec=11.872505251142632, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 19:56:56,306] [INFO] [timer.py:197:stop] 0/838, RunningAvgSamplesPerSec=11.98411650038727, CurrSamplesPerSec=11.872934276154512, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:57:02,732] [INFO] [timer.py:197:stop] 0/839, RunningAvgSamplesPerSec=11.984133208564284, CurrSamplesPerSec=11.998117563429098, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:57:09,241] [INFO] [logging.py:68:log_dist] [Rank 0] step=840, skipped=4, lr=[9.255555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 19:57:09,242] [INFO] [timer.py:197:stop] 0/840, RunningAvgSamplesPerSec=11.984034163733051, CurrSamplesPerSec=11.901703847041274, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:57:15,751] [INFO] [timer.py:197:stop] 0/841, RunningAvgSamplesPerSec=11.983931751241053, CurrSamplesPerSec=11.898721038708024, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:57:22,309] [INFO] [timer.py:197:stop] 0/842, RunningAvgSamplesPerSec=11.983558829090255, CurrSamplesPerSec=11.678647634609634, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:57:28,800] [INFO] [timer.py:197:stop] 0/843, RunningAvgSamplesPerSec=11.983552059522403, CurrSamplesPerSec=11.977868322785367, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:57:35,334] [INFO] [timer.py:197:stop] 0/844, RunningAvgSamplesPerSec=11.983488350892907, CurrSamplesPerSec=11.930148164513145, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:57:41,871] [INFO] [timer.py:197:stop] 0/845, RunningAvgSamplesPerSec=11.983381822856135, CurrSamplesPerSec=11.894352399949097, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:57:48,338] [INFO] [timer.py:197:stop] 0/846, RunningAvgSamplesPerSec=11.983375867228439, CurrSamplesPerSec=11.97835737813754, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:57:54,888] [INFO] [timer.py:197:stop] 0/847, RunningAvgSamplesPerSec=11.983265309483414, CurrSamplesPerSec=11.890676393998765, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:58:01,387] [INFO] [timer.py:197:stop] 0/848, RunningAvgSamplesPerSec=11.983212373788698, CurrSamplesPerSec=11.938648256536025, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:58:07,992] [INFO] [timer.py:197:stop] 0/849, RunningAvgSamplesPerSec=11.982958740630838, CurrSamplesPerSec=11.772164158073844, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:58:14,613] [INFO] [logging.py:68:log_dist] [Rank 0] step=850, skipped=4, lr=[9.233333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 19:58:14,613] [INFO] [timer.py:197:stop] 0/850, RunningAvgSamplesPerSec=11.98289203487784, CurrSamplesPerSec=11.92665772196626, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0021, 'learning_rate': 9.233333333333334e-06, 'epoch': 22.37} [2022-12-19 19:58:21,447] [INFO] [timer.py:197:stop] 0/851, RunningAvgSamplesPerSec=11.982796362029921, CurrSamplesPerSec=11.902212033666396, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:58:27,996] [INFO] [timer.py:197:stop] 0/852, RunningAvgSamplesPerSec=11.982451781422256, CurrSamplesPerSec=11.696883153157295, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:58:34,556] [INFO] [timer.py:197:stop] 0/853, RunningAvgSamplesPerSec=11.98233031250701, CurrSamplesPerSec=11.87996482861918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:58:41,100] [INFO] [timer.py:197:stop] 0/854, RunningAvgSamplesPerSec=11.982188779835464, CurrSamplesPerSec=11.862944521588084, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:58:47,659] [INFO] [timer.py:197:stop] 0/855, RunningAvgSamplesPerSec=11.982008104786834, CurrSamplesPerSec=11.830027775298824, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:58:54,323] [INFO] [timer.py:197:stop] 0/856, RunningAvgSamplesPerSec=11.981849722007539, CurrSamplesPerSec=11.848257289324078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:59:00,937] [INFO] [timer.py:197:stop] 0/857, 
RunningAvgSamplesPerSec=11.981705057784435, CurrSamplesPerSec=11.859424125173737, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:59:07,431] [INFO] [timer.py:197:stop] 0/858, RunningAvgSamplesPerSec=11.981701116487608, CurrSamplesPerSec=11.97833225628853, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:59:14,029] [INFO] [timer.py:197:stop] 0/859, RunningAvgSamplesPerSec=11.98145374060705, CurrSamplesPerSec=11.773381637003185, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:59:20,513] [INFO] [logging.py:68:log_dist] [Rank 0] step=860, skipped=4, lr=[9.211111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 19:59:20,514] [INFO] [timer.py:197:stop] 0/860, RunningAvgSamplesPerSec=11.981369267363489, CurrSamplesPerSec=11.909410989709924, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:59:27,015] [INFO] [timer.py:197:stop] 0/861, RunningAvgSamplesPerSec=11.981316968692994, CurrSamplesPerSec=11.93661233172, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:59:33,610] [INFO] [timer.py:197:stop] 0/862, RunningAvgSamplesPerSec=11.981154136555933, CurrSamplesPerSec=11.842897275950751, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:59:40,125] [INFO] [timer.py:197:stop] 0/863, RunningAvgSamplesPerSec=11.981118032618161, CurrSamplesPerSec=11.950148996538209, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:59:46,675] [INFO] [timer.py:197:stop] 0/864, RunningAvgSamplesPerSec=11.980999003889433, CurrSamplesPerSec=11.87938546550021, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:59:53,310] [INFO] [timer.py:197:stop] 0/865, RunningAvgSamplesPerSec=11.980747946956448, CurrSamplesPerSec=11.768180972470478, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 19:59:59,871] [INFO] [timer.py:197:stop] 0/866, RunningAvgSamplesPerSec=11.980606286580622, CurrSamplesPerSec=11.85958969472394, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:00:06,367] [INFO] [timer.py:197:stop] 0/867, 
RunningAvgSamplesPerSec=11.980539400299683, CurrSamplesPerSec=11.923027391580408, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:00:12,998] [INFO] [timer.py:197:stop] 0/868, RunningAvgSamplesPerSec=11.98032799892162, CurrSamplesPerSec=11.80021809936344, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:00:19,529] [INFO] [timer.py:197:stop] 0/869, RunningAvgSamplesPerSec=11.980153125016635, CurrSamplesPerSec=11.830604940925745, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:00:26,052] [INFO] [logging.py:68:log_dist] [Rank 0] step=870, skipped=4, lr=[9.188888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 20:00:26,053] [INFO] [timer.py:197:stop] 0/870, RunningAvgSamplesPerSec=11.98001204672134, CurrSamplesPerSec=11.858934778907349, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:00:32,533] [INFO] [timer.py:197:stop] 0/871, RunningAvgSamplesPerSec=11.980017343293701, CurrSamplesPerSec=11.98461653511299, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:00:39,010] [INFO] [timer.py:197:stop] 0/872, RunningAvgSamplesPerSec=11.97996003318681, CurrSamplesPerSec=11.930363965815399, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:00:45,544] [INFO] [timer.py:197:stop] 0/873, RunningAvgSamplesPerSec=11.97985772165359, CurrSamplesPerSec=11.891503916191386, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:00:50,179] [INFO] [timer.py:197:stop] 0/874, RunningAvgSamplesPerSec=11.983552615117333, CurrSamplesPerSec=16.385268098141385, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:00:56,681] [INFO] [timer.py:197:stop] 0/875, RunningAvgSamplesPerSec=11.9834597081003, CurrSamplesPerSec=11.90298943735917, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0023, 'learning_rate': 9.17777777777778e-06, 'epoch': 23.03} [2022-12-19 20:01:03,240] [INFO] [timer.py:197:stop] 0/876, RunningAvgSamplesPerSec=11.98322962488529, CurrSamplesPerSec=11.7856820593804, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 20:01:09,780] [INFO] [timer.py:197:stop] 0/877, RunningAvgSamplesPerSec=11.983021345565856, CurrSamplesPerSec=11.803712252072884, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:01:16,399] [INFO] [timer.py:197:stop] 0/878, RunningAvgSamplesPerSec=11.982866799878284, CurrSamplesPerSec=11.84915005054646, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:01:22,908] [INFO] [timer.py:197:stop] 0/879, RunningAvgSamplesPerSec=11.982831265054113, CurrSamplesPerSec=11.951783505743089, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:01:29,435] [INFO] [logging.py:68:log_dist] [Rank 0] step=880, skipped=4, lr=[9.166666666666666e-06], mom=[[0.9, 0.999]] [2022-12-19 20:01:29,436] [INFO] [timer.py:197:stop] 0/880, RunningAvgSamplesPerSec=11.982790429441541, CurrSamplesPerSec=11.947084433011256, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:01:36,043] [INFO] [timer.py:197:stop] 0/881, RunningAvgSamplesPerSec=11.98267979983293, CurrSamplesPerSec=11.886328922836121, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:01:42,555] [INFO] [timer.py:197:stop] 0/882, RunningAvgSamplesPerSec=11.98253675206111, CurrSamplesPerSec=11.85810497348226, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:01:49,021] [INFO] [timer.py:197:stop] 0/883, RunningAvgSamplesPerSec=11.9824280164329, CurrSamplesPerSec=11.887499589262312, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:01:55,487] [INFO] [timer.py:197:stop] 0/884, RunningAvgSamplesPerSec=11.982316050914847, CurrSamplesPerSec=11.88448074999409, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:02:02,029] [INFO] [timer.py:197:stop] 0/885, RunningAvgSamplesPerSec=11.982254726407474, CurrSamplesPerSec=11.928409843708701, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:02:08,489] [INFO] [timer.py:197:stop] 0/886, RunningAvgSamplesPerSec=11.98218034728463, CurrSamplesPerSec=11.916862010355025, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 20:02:15,010] [INFO] [timer.py:197:stop] 0/887, RunningAvgSamplesPerSec=11.982074813573476, CurrSamplesPerSec=11.889504575215854, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:02:21,559] [INFO] [timer.py:197:stop] 0/888, RunningAvgSamplesPerSec=11.98192834413386, CurrSamplesPerSec=11.853691773891462, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:02:28,069] [INFO] [timer.py:197:stop] 0/889, RunningAvgSamplesPerSec=11.981854329288753, CurrSamplesPerSec=11.916634529889821, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:02:34,532] [INFO] [logging.py:68:log_dist] [Rank 0] step=890, skipped=4, lr=[9.144444444444444e-06], mom=[[0.9, 0.999]] [2022-12-19 20:02:34,532] [INFO] [timer.py:197:stop] 0/890, RunningAvgSamplesPerSec=11.981746252846698, CurrSamplesPerSec=11.886644201042495, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:02:41,039] [INFO] [timer.py:197:stop] 0/891, RunningAvgSamplesPerSec=11.981651329527743, CurrSamplesPerSec=11.897948938656091, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:02:47,540] [INFO] [timer.py:197:stop] 0/892, RunningAvgSamplesPerSec=11.98154696408698, CurrSamplesPerSec=11.889479824680796, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:02:54,035] [INFO] [timer.py:197:stop] 0/893, RunningAvgSamplesPerSec=11.981458539146406, CurrSamplesPerSec=11.90327445804242, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:03:00,535] [INFO] [timer.py:197:stop] 0/894, RunningAvgSamplesPerSec=11.981410328686746, CurrSamplesPerSec=11.938608433877473, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:03:07,112] [INFO] [timer.py:197:stop] 0/895, RunningAvgSamplesPerSec=11.981271866167665, CurrSamplesPerSec=11.85902489103199, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:03:13,681] [INFO] [timer.py:197:stop] 0/896, RunningAvgSamplesPerSec=11.981114496504203, CurrSamplesPerSec=11.842214423941336, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:03:20,223] [INFO] [timer.py:197:stop] 0/897, RunningAvgSamplesPerSec=11.980949563091654, CurrSamplesPerSec=11.835293695421383, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:03:26,689] [INFO] [timer.py:197:stop] 0/898, RunningAvgSamplesPerSec=11.980904528365928, CurrSamplesPerSec=11.940733742129343, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:03:33,192] [INFO] [timer.py:197:stop] 0/899, RunningAvgSamplesPerSec=11.9807731842482, CurrSamplesPerSec=11.864234861021599, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:03:39,737] [INFO] [logging.py:68:log_dist] [Rank 0] step=900, skipped=4, lr=[9.122222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 20:03:39,738] [INFO] [timer.py:197:stop] 0/900, RunningAvgSamplesPerSec=11.980638461612315, CurrSamplesPerSec=11.861000367094544, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0016, 'learning_rate': 9.122222222222223e-06, 'epoch': 23.68} [2022-12-19 20:03:46,245] [INFO] [timer.py:197:stop] 0/901, RunningAvgSamplesPerSec=11.98049915506701, CurrSamplesPerSec=11.856696037118041, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:03:52,725] [INFO] [timer.py:197:stop] 0/902, RunningAvgSamplesPerSec=11.980498896458982, CurrSamplesPerSec=11.980266412359146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:03:59,326] [INFO] [timer.py:197:stop] 0/903, RunningAvgSamplesPerSec=11.980391565922977, CurrSamplesPerSec=11.884567567571157, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:04:05,808] [INFO] [timer.py:197:stop] 0/904, RunningAvgSamplesPerSec=11.980325013931116, CurrSamplesPerSec=11.920660630001858, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:04:12,317] [INFO] [timer.py:197:stop] 0/905, RunningAvgSamplesPerSec=11.980207031909814, CurrSamplesPerSec=11.874725278419776, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:04:18,875] [INFO] [timer.py:197:stop] 0/906, 
RunningAvgSamplesPerSec=11.980032154817515, CurrSamplesPerSec=11.824174838651505, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:04:25,333] [INFO] [timer.py:197:stop] 0/907, RunningAvgSamplesPerSec=11.9800185586484, CurrSamplesPerSec=11.967740232689193, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:04:31,811] [INFO] [timer.py:197:stop] 0/908, RunningAvgSamplesPerSec=11.979888734385144, CurrSamplesPerSec=11.863540108650193, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:04:38,361] [INFO] [timer.py:197:stop] 0/909, RunningAvgSamplesPerSec=11.979805292224095, CurrSamplesPerSec=11.904681287829456, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:04:44,874] [INFO] [logging.py:68:log_dist] [Rank 0] step=910, skipped=4, lr=[9.100000000000001e-06], mom=[[0.9, 0.999]] [2022-12-19 20:04:44,875] [INFO] [timer.py:197:stop] 0/910, RunningAvgSamplesPerSec=11.97970836545113, CurrSamplesPerSec=11.892436926243004, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:04:51,363] [INFO] [timer.py:197:stop] 0/911, RunningAvgSamplesPerSec=11.979674537317225, CurrSamplesPerSec=11.949037232448587, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:04:56,007] [INFO] [timer.py:197:stop] 0/912, RunningAvgSamplesPerSec=11.983269197818773, CurrSamplesPerSec=16.47768060258604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:05:02,511] [INFO] [timer.py:197:stop] 0/913, RunningAvgSamplesPerSec=11.983168218383554, CurrSamplesPerSec=11.891976988254015, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:05:09,018] [INFO] [timer.py:197:stop] 0/914, RunningAvgSamplesPerSec=11.983169905972785, CurrSamplesPerSec=11.984707497243635, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:05:15,599] [INFO] [timer.py:197:stop] 0/915, RunningAvgSamplesPerSec=11.98304526607897, CurrSamplesPerSec=11.870443003110184, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:05:22,071] [INFO] [timer.py:197:stop] 0/916, 
RunningAvgSamplesPerSec=11.983068807576172, CurrSamplesPerSec=12.004600857705775, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:05:28,579] [INFO] [timer.py:197:stop] 0/917, RunningAvgSamplesPerSec=11.982985495407744, CurrSamplesPerSec=11.907319528019647, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:05:35,150] [INFO] [timer.py:197:stop] 0/918, RunningAvgSamplesPerSec=11.982889907948238, CurrSamplesPerSec=11.896061829015457, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:05:41,595] [INFO] [timer.py:197:stop] 0/919, RunningAvgSamplesPerSec=11.982842926425958, CurrSamplesPerSec=11.939962022236086, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:05:48,116] [INFO] [logging.py:68:log_dist] [Rank 0] step=920, skipped=4, lr=[9.077777777777779e-06], mom=[[0.9, 0.999]] [2022-12-19 20:05:48,117] [INFO] [timer.py:197:stop] 0/920, RunningAvgSamplesPerSec=11.982764407919769, CurrSamplesPerSec=11.911193458867416, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:05:54,651] [INFO] [timer.py:197:stop] 0/921, RunningAvgSamplesPerSec=11.982658492465227, CurrSamplesPerSec=11.886211553084873, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:06:01,107] [INFO] [timer.py:197:stop] 0/922, RunningAvgSamplesPerSec=11.982668903127419, CurrSamplesPerSec=11.99224395507005, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:06:07,639] [INFO] [timer.py:197:stop] 0/923, RunningAvgSamplesPerSec=11.982588259962645, CurrSamplesPerSec=11.908853582194906, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:06:14,131] [INFO] [timer.py:197:stop] 0/924, RunningAvgSamplesPerSec=11.98250657628016, CurrSamplesPerSec=11.907745790608923, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:06:20,639] [INFO] [timer.py:197:stop] 0/925, RunningAvgSamplesPerSec=11.982540587864234, CurrSamplesPerSec=12.013981639833316, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0022, 'learning_rate': 9.066666666666667e-06, 
'epoch': 24.34} [2022-12-19 20:06:27,055] [INFO] [timer.py:197:stop] 0/926, RunningAvgSamplesPerSec=11.982550243799029, CurrSamplesPerSec=11.991469312661055, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:06:33,597] [INFO] [timer.py:197:stop] 0/927, RunningAvgSamplesPerSec=11.982452504105975, CurrSamplesPerSec=11.89281733711839, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:06:40,097] [INFO] [timer.py:197:stop] 0/928, RunningAvgSamplesPerSec=11.982370871752023, CurrSamplesPerSec=11.907334317288633, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:06:46,630] [INFO] [timer.py:197:stop] 0/929, RunningAvgSamplesPerSec=11.98231029926522, CurrSamplesPerSec=11.926481796257018, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:06:53,156] [INFO] [logging.py:68:log_dist] [Rank 0] step=930, skipped=4, lr=[9.055555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 20:06:53,157] [INFO] [timer.py:197:stop] 0/930, RunningAvgSamplesPerSec=11.982241703113752, CurrSamplesPerSec=11.918989108397547, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:06:59,646] [INFO] [timer.py:197:stop] 0/931, RunningAvgSamplesPerSec=11.9821701781696, CurrSamplesPerSec=11.91616108095669, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:07:06,105] [INFO] [timer.py:197:stop] 0/932, RunningAvgSamplesPerSec=11.982116413439563, CurrSamplesPerSec=11.932376542997222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:07:12,613] [INFO] [timer.py:197:stop] 0/933, RunningAvgSamplesPerSec=11.98197058748448, CurrSamplesPerSec=11.84787187980677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:07:19,057] [INFO] [timer.py:197:stop] 0/934, RunningAvgSamplesPerSec=11.982000665016344, CurrSamplesPerSec=12.010068512775431, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:07:25,517] [INFO] [timer.py:197:stop] 0/935, RunningAvgSamplesPerSec=11.982020506595292, CurrSamplesPerSec=12.000541473024791, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 20:07:32,044] [INFO] [timer.py:197:stop] 0/936, RunningAvgSamplesPerSec=11.98205803557732, CurrSamplesPerSec=12.01717530716078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:07:38,507] [INFO] [timer.py:197:stop] 0/937, RunningAvgSamplesPerSec=11.982042212754564, CurrSamplesPerSec=11.96728192097118, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:07:44,997] [INFO] [timer.py:197:stop] 0/938, RunningAvgSamplesPerSec=11.982078107109796, CurrSamplesPerSec=12.01573369775648, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:07:51,484] [INFO] [timer.py:197:stop] 0/939, RunningAvgSamplesPerSec=11.982010053807992, CurrSamplesPerSec=11.918649356715488, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:07:57,952] [INFO] [logging.py:68:log_dist] [Rank 0] step=940, skipped=4, lr=[9.033333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 20:07:57,953] [INFO] [timer.py:197:stop] 0/940, RunningAvgSamplesPerSec=11.982034470581441, CurrSamplesPerSec=12.004956801951309, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:08:04,393] [INFO] [timer.py:197:stop] 0/941, RunningAvgSamplesPerSec=11.982021466328323, CurrSamplesPerSec=11.969835895338058, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:08:10,963] [INFO] [timer.py:197:stop] 0/942, RunningAvgSamplesPerSec=11.981886755861986, CurrSamplesPerSec=11.85671646168461, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:08:17,478] [INFO] [timer.py:197:stop] 0/943, RunningAvgSamplesPerSec=11.981822838767812, CurrSamplesPerSec=11.922040861972283, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:08:23,989] [INFO] [timer.py:197:stop] 0/944, RunningAvgSamplesPerSec=11.981670732945945, CurrSamplesPerSec=11.840230577000028, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:08:30,465] [INFO] [timer.py:197:stop] 0/945, RunningAvgSamplesPerSec=11.981697213378585, CurrSamplesPerSec=12.00669387645563, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 20:08:36,893] [INFO] [timer.py:197:stop] 0/946, RunningAvgSamplesPerSec=11.981735401368988, CurrSamplesPerSec=12.017855350458216, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:08:43,343] [INFO] [timer.py:197:stop] 0/947, RunningAvgSamplesPerSec=11.981674702666513, CurrSamplesPerSec=11.924648132744139, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:08:49,857] [INFO] [timer.py:197:stop] 0/948, RunningAvgSamplesPerSec=11.981573502433035, CurrSamplesPerSec=11.886697363122433, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:08:56,286] [INFO] [timer.py:197:stop] 0/949, RunningAvgSamplesPerSec=11.981453036635726, CurrSamplesPerSec=11.868567231791992, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:09:00,953] [INFO] [logging.py:68:log_dist] [Rank 0] step=950, skipped=4, lr=[9.011111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 20:09:00,953] [INFO] [timer.py:197:stop] 0/950, RunningAvgSamplesPerSec=11.984961377845826, CurrSamplesPerSec=16.583480525231515, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0034, 'learning_rate': 9.011111111111111e-06, 'epoch': 25.0} [2022-12-19 20:09:07,400] [INFO] [timer.py:197:stop] 0/951, RunningAvgSamplesPerSec=11.98489328687906, CurrSamplesPerSec=11.920689216145462, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:09:13,962] [INFO] [timer.py:197:stop] 0/952, RunningAvgSamplesPerSec=11.984780237527145, CurrSamplesPerSec=11.878449245829165, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:09:20,510] [INFO] [timer.py:197:stop] 0/953, RunningAvgSamplesPerSec=11.98465349369228, CurrSamplesPerSec=11.865445759819316, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:09:26,989] [INFO] [timer.py:197:stop] 0/954, RunningAvgSamplesPerSec=11.984692836082711, CurrSamplesPerSec=12.022224742196753, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:09:33,470] [INFO] [timer.py:197:stop] 0/955, 
RunningAvgSamplesPerSec=11.98463364100865, CurrSamplesPerSec=11.928543950456334, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:09:39,921] [INFO] [timer.py:197:stop] 0/956, RunningAvgSamplesPerSec=11.984641677719958, CurrSamplesPerSec=11.992305566473584, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:09:46,450] [INFO] [timer.py:197:stop] 0/957, RunningAvgSamplesPerSec=11.984469362038348, CurrSamplesPerSec=11.822306891565008, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:09:52,970] [INFO] [timer.py:197:stop] 0/958, RunningAvgSamplesPerSec=11.984363497840585, CurrSamplesPerSec=11.884109816452387, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:09:59,489] [INFO] [timer.py:197:stop] 0/959, RunningAvgSamplesPerSec=11.984251686794812, CurrSamplesPerSec=11.87830627636451, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:10:05,991] [INFO] [logging.py:68:log_dist] [Rank 0] step=960, skipped=4, lr=[8.988888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 20:10:05,992] [INFO] [timer.py:197:stop] 0/960, RunningAvgSamplesPerSec=11.984115171781156, CurrSamplesPerSec=11.854880626653486, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:10:12,463] [INFO] [timer.py:197:stop] 0/961, RunningAvgSamplesPerSec=11.984064226610863, CurrSamplesPerSec=11.935456914890047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:10:18,985] [INFO] [timer.py:197:stop] 0/962, RunningAvgSamplesPerSec=11.983996430335285, CurrSamplesPerSec=11.919330996464325, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:10:25,438] [INFO] [timer.py:197:stop] 0/963, RunningAvgSamplesPerSec=11.983990497879097, CurrSamplesPerSec=11.978298047982722, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:10:31,976] [INFO] [timer.py:197:stop] 0/964, RunningAvgSamplesPerSec=11.98389065684, CurrSamplesPerSec=11.888706290748539, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:10:38,445] [INFO] [timer.py:197:stop] 0/965, 
RunningAvgSamplesPerSec=11.983906035094055, CurrSamplesPerSec=11.998718219832808, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:10:44,942] [INFO] [timer.py:197:stop] 0/966, RunningAvgSamplesPerSec=11.983806121153947, CurrSamplesPerSec=11.888356153615007, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:10:51,498] [INFO] [timer.py:197:stop] 0/967, RunningAvgSamplesPerSec=11.983773364975226, CurrSamplesPerSec=11.952279480406327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:10:57,944] [INFO] [timer.py:197:stop] 0/968, RunningAvgSamplesPerSec=11.983732493570201, CurrSamplesPerSec=11.944421103591708, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:11:04,380] [INFO] [timer.py:197:stop] 0/969, RunningAvgSamplesPerSec=11.98370986771487, CurrSamplesPerSec=11.961893123283378, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:11:10,897] [INFO] [logging.py:68:log_dist] [Rank 0] step=970, skipped=4, lr=[8.966666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 20:11:10,898] [INFO] [timer.py:197:stop] 0/970, RunningAvgSamplesPerSec=11.983570224433443, CurrSamplesPerSec=11.850041376962448, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:11:17,432] [INFO] [timer.py:197:stop] 0/971, RunningAvgSamplesPerSec=11.983510054057579, CurrSamplesPerSec=11.925547145492628, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:11:23,983] [INFO] [timer.py:197:stop] 0/972, RunningAvgSamplesPerSec=11.983431301211311, CurrSamplesPerSec=11.907603171475493, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:11:30,421] [INFO] [timer.py:197:stop] 0/973, RunningAvgSamplesPerSec=11.983445113993465, CurrSamplesPerSec=11.996858525352687, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:11:36,943] [INFO] [timer.py:197:stop] 0/974, RunningAvgSamplesPerSec=11.983394067382388, CurrSamplesPerSec=11.934032191134115, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:11:43,431] [INFO] [timer.py:197:stop] 
0/975, RunningAvgSamplesPerSec=11.983338599644666, CurrSamplesPerSec=11.929665688613657, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0036, 'learning_rate': 8.955555555555555e-06, 'epoch': 25.66} [2022-12-19 20:11:49,942] [INFO] [timer.py:197:stop] 0/976, RunningAvgSamplesPerSec=11.98324976326228, CurrSamplesPerSec=11.89743162439965, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:11:56,373] [INFO] [timer.py:197:stop] 0/977, RunningAvgSamplesPerSec=11.983282055763688, CurrSamplesPerSec=12.014817809999654, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:12:02,896] [INFO] [timer.py:197:stop] 0/978, RunningAvgSamplesPerSec=11.983203514993598, CurrSamplesPerSec=11.907113010280371, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:12:09,357] [INFO] [timer.py:197:stop] 0/979, RunningAvgSamplesPerSec=11.98314478188719, CurrSamplesPerSec=11.926094460052305, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:12:15,926] [INFO] [logging.py:68:log_dist] [Rank 0] step=980, skipped=4, lr=[8.944444444444446e-06], mom=[[0.9, 0.999]] [2022-12-19 20:12:15,927] [INFO] [timer.py:197:stop] 0/980, RunningAvgSamplesPerSec=11.982959135171967, CurrSamplesPerSec=11.80428944815244, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:12:22,406] [INFO] [timer.py:197:stop] 0/981, RunningAvgSamplesPerSec=11.982991942908281, CurrSamplesPerSec=12.01516414221157, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:12:28,925] [INFO] [timer.py:197:stop] 0/982, RunningAvgSamplesPerSec=11.982913239797575, CurrSamplesPerSec=11.906355663820282, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:12:35,483] [INFO] [timer.py:197:stop] 0/983, RunningAvgSamplesPerSec=11.982813430020077, CurrSamplesPerSec=11.885792619871138, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:12:42,393] [INFO] [timer.py:197:stop] 0/984, RunningAvgSamplesPerSec=11.982698869313275, CurrSamplesPerSec=11.871360111006437, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 20:12:49,463] [INFO] [timer.py:197:stop] 0/985, RunningAvgSamplesPerSec=11.98248540580213, CurrSamplesPerSec=11.776471900745703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:12:56,451] [INFO] [timer.py:197:stop] 0/986, RunningAvgSamplesPerSec=11.982282268839205, CurrSamplesPerSec=11.785875071912223, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:13:03,148] [INFO] [timer.py:197:stop] 0/987, RunningAvgSamplesPerSec=11.982128577028242, CurrSamplesPerSec=11.832782723174759, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:13:08,211] [INFO] [timer.py:197:stop] 0/988, RunningAvgSamplesPerSec=11.985415190922605, CurrSamplesPerSec=16.42239476510482, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:13:14,766] [INFO] [timer.py:197:stop] 0/989, RunningAvgSamplesPerSec=11.985273780707011, CurrSamplesPerSec=11.847448324319458, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:13:21,353] [INFO] [logging.py:68:log_dist] [Rank 0] step=990, skipped=4, lr=[8.922222222222224e-06], mom=[[0.9, 0.999]] [2022-12-19 20:13:21,353] [INFO] [timer.py:197:stop] 0/990, RunningAvgSamplesPerSec=11.985189516140235, CurrSamplesPerSec=11.90259412571053, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:13:27,849] [INFO] [timer.py:197:stop] 0/991, RunningAvgSamplesPerSec=11.98518585698797, CurrSamplesPerSec=11.981571705836634, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:13:34,316] [INFO] [timer.py:197:stop] 0/992, RunningAvgSamplesPerSec=11.985165623087367, CurrSamplesPerSec=11.96518768577523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:13:40,898] [INFO] [timer.py:197:stop] 0/993, RunningAvgSamplesPerSec=11.98501916329215, CurrSamplesPerSec=11.841758884755997, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:13:47,480] [INFO] [timer.py:197:stop] 0/994, RunningAvgSamplesPerSec=11.98490783023967, CurrSamplesPerSec=11.875584205771913, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 20:13:54,119] [INFO] [timer.py:197:stop] 0/995, RunningAvgSamplesPerSec=11.98478266961461, CurrSamplesPerSec=11.861897670620404, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:14:00,786] [INFO] [timer.py:197:stop] 0/996, RunningAvgSamplesPerSec=11.984706039513199, CurrSamplesPerSec=11.909092917607682, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:14:07,334] [INFO] [timer.py:197:stop] 0/997, RunningAvgSamplesPerSec=11.984605081457271, CurrSamplesPerSec=11.88508692102057, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:14:13,839] [INFO] [timer.py:197:stop] 0/998, RunningAvgSamplesPerSec=11.984535021173334, CurrSamplesPerSec=11.91522857593963, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:14:20,389] [INFO] [timer.py:197:stop] 0/999, RunningAvgSamplesPerSec=11.98435874770376, CurrSamplesPerSec=11.811327784854544, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:14:27,042] [INFO] [logging.py:68:log_dist] [Rank 0] step=1000, skipped=4, lr=[8.900000000000001e-06], mom=[[0.9, 0.999]] [2022-12-19 20:14:27,043] [INFO] [timer.py:197:stop] 0/1000, RunningAvgSamplesPerSec=11.984206885402875, CurrSamplesPerSec=11.83469102750876, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.003, 'learning_rate': 8.900000000000001e-06, 'epoch': 26.32} {'eval_loss': 0.366455078125, 'eval_wer': 19.037900874635568, 'eval_runtime': 167.6651, 'eval_samples_per_second': 7.199, 'eval_steps_per_second': 0.227, 'epoch': 26.32} [2022-12-19 20:17:16,832] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step1000 is begin to save! [2022-12-19 20:17:16,840] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: ./checkpoint-1000/global_step1000/mp_rank_00_model_states.pt [2022-12-19 20:17:16,841] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving ./checkpoint-1000/global_step1000/mp_rank_00_model_states.pt... 
[2022-12-19 20:17:18,775] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved ./checkpoint-1000/global_step1000/mp_rank_00_model_states.pt. [2022-12-19 20:17:18,776] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving ./checkpoint-1000/global_step1000/zero_pp_rank_0_mp_rank_00_optim_states.pt... [2022-12-19 20:17:26,330] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved ./checkpoint-1000/global_step1000/zero_pp_rank_0_mp_rank_00_optim_states.pt. [2022-12-19 20:17:26,330] [INFO] [engine.py:3269:_save_zero_checkpoint] zero checkpoint saved ./checkpoint-1000/global_step1000/zero_pp_rank_0_mp_rank_00_optim_states.pt [2022-12-19 20:17:26,330] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step1000 is ready now! [2022-12-19 20:17:58,798] [INFO] [timer.py:197:stop] 0/1001, RunningAvgSamplesPerSec=11.98365454152263, CurrSamplesPerSec=11.456680071573453, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:18:05,329] [INFO] [timer.py:197:stop] 0/1002, RunningAvgSamplesPerSec=11.983499631120777, CurrSamplesPerSec=11.830719129416392, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:18:11,941] [INFO] [timer.py:197:stop] 0/1003, RunningAvgSamplesPerSec=11.983451878963741, CurrSamplesPerSec=11.935889440240707, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:18:18,663] [INFO] [timer.py:197:stop] 0/1004, RunningAvgSamplesPerSec=11.983385115629577, CurrSamplesPerSec=11.916926023865686, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:18:25,275] [INFO] [timer.py:197:stop] 0/1005, RunningAvgSamplesPerSec=11.983331665823895, CurrSamplesPerSec=11.930013491489722, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:18:31,967] [INFO] [timer.py:197:stop] 0/1006, RunningAvgSamplesPerSec=11.983301098253772, CurrSamplesPerSec=11.952720144788701, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:18:38,501] [INFO] [timer.py:197:stop] 0/1007, RunningAvgSamplesPerSec=11.983289944544538, 
CurrSamplesPerSec=11.972102085885544, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:18:44,955] [INFO] [timer.py:197:stop] 0/1008, RunningAvgSamplesPerSec=11.983287326470961, CurrSamplesPerSec=11.980656740697405, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:18:51,450] [INFO] [timer.py:197:stop] 0/1009, RunningAvgSamplesPerSec=11.983230774894896, CurrSamplesPerSec=11.926608970948681, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:18:57,905] [INFO] [logging.py:68:log_dist] [Rank 0] step=1010, skipped=4, lr=[8.877777777777779e-06], mom=[[0.9, 0.999]] [2022-12-19 20:18:57,906] [INFO] [timer.py:197:stop] 0/1010, RunningAvgSamplesPerSec=11.983171083970793, CurrSamplesPerSec=11.923362626403597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:19:04,502] [INFO] [timer.py:197:stop] 0/1011, RunningAvgSamplesPerSec=11.983137660107197, CurrSamplesPerSec=11.949540958296936, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:19:11,097] [INFO] [timer.py:197:stop] 0/1012, RunningAvgSamplesPerSec=11.98311600016559, CurrSamplesPerSec=11.961300944947185, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:19:17,584] [INFO] [timer.py:197:stop] 0/1013, RunningAvgSamplesPerSec=11.983015460390819, CurrSamplesPerSec=11.882324397993427, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:19:24,118] [INFO] [timer.py:197:stop] 0/1014, RunningAvgSamplesPerSec=11.982997580987085, CurrSamplesPerSec=11.964948756979727, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:19:30,632] [INFO] [timer.py:197:stop] 0/1015, RunningAvgSamplesPerSec=11.982867550816463, CurrSamplesPerSec=11.852707787304318, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:19:37,140] [INFO] [timer.py:197:stop] 0/1016, RunningAvgSamplesPerSec=11.982894741242305, CurrSamplesPerSec=12.010502163730113, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:19:43,755] [INFO] [timer.py:197:stop] 0/1017, 
RunningAvgSamplesPerSec=11.982814091240728, CurrSamplesPerSec=11.9015898672146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:19:50,322] [INFO] [timer.py:197:stop] 0/1018, RunningAvgSamplesPerSec=11.982769915370243, CurrSamplesPerSec=11.938098727310601, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:19:56,920] [INFO] [timer.py:197:stop] 0/1019, RunningAvgSamplesPerSec=11.982677795427527, CurrSamplesPerSec=11.889810015832973, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:20:03,401] [INFO] [logging.py:68:log_dist] [Rank 0] step=1020, skipped=4, lr=[8.855555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 20:20:03,402] [INFO] [timer.py:197:stop] 0/1020, RunningAvgSamplesPerSec=11.982625564100335, CurrSamplesPerSec=11.929740973492052, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:20:09,879] [INFO] [timer.py:197:stop] 0/1021, RunningAvgSamplesPerSec=11.98257244365925, CurrSamplesPerSec=11.928739020199522, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:20:16,273] [INFO] [timer.py:197:stop] 0/1022, RunningAvgSamplesPerSec=11.982569079603438, CurrSamplesPerSec=11.979142088087634, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:20:22,955] [INFO] [timer.py:197:stop] 0/1023, RunningAvgSamplesPerSec=11.982504976157971, CurrSamplesPerSec=11.917474663727647, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:20:29,518] [INFO] [timer.py:197:stop] 0/1024, RunningAvgSamplesPerSec=11.98251530426668, CurrSamplesPerSec=11.993069600476229, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:20:36,165] [INFO] [timer.py:197:stop] 0/1025, RunningAvgSamplesPerSec=11.982381793786423, CurrSamplesPerSec=11.84747185438538, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0017, 'learning_rate': 8.844444444444445e-06, 'epoch': 26.97} [2022-12-19 20:20:40,818] [INFO] [timer.py:197:stop] 0/1026, RunningAvgSamplesPerSec=11.985541597305163, CurrSamplesPerSec=16.413363531400687, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 20:20:47,434] [INFO] [timer.py:197:stop] 0/1027, RunningAvgSamplesPerSec=11.985555904496179, CurrSamplesPerSec=12.000224415684368, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:20:54,030] [INFO] [timer.py:197:stop] 0/1028, RunningAvgSamplesPerSec=11.985542085786149, CurrSamplesPerSec=11.971394643365787, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:21:00,538] [INFO] [timer.py:197:stop] 0/1029, RunningAvgSamplesPerSec=11.98547283212985, CurrSamplesPerSec=11.914837739480353, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:21:07,007] [INFO] [logging.py:68:log_dist] [Rank 0] step=1030, skipped=4, lr=[8.833333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 20:21:07,007] [INFO] [timer.py:197:stop] 0/1030, RunningAvgSamplesPerSec=11.985412532851136, CurrSamplesPerSec=11.9238038093298, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:21:13,419] [INFO] [timer.py:197:stop] 0/1031, RunningAvgSamplesPerSec=11.985436855142273, CurrSamplesPerSec=12.010492490876654, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:21:19,900] [INFO] [timer.py:197:stop] 0/1032, RunningAvgSamplesPerSec=11.9853421671813, CurrSamplesPerSec=11.888694706934551, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:21:26,481] [INFO] [timer.py:197:stop] 0/1033, RunningAvgSamplesPerSec=11.985171397561468, CurrSamplesPerSec=11.81182516425135, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:21:32,990] [INFO] [timer.py:197:stop] 0/1034, RunningAvgSamplesPerSec=11.98509704066105, CurrSamplesPerSec=11.908922792992232, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:21:39,430] [INFO] [timer.py:197:stop] 0/1035, RunningAvgSamplesPerSec=11.985103810615868, CurrSamplesPerSec=11.992094483078548, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:21:45,896] [INFO] [timer.py:197:stop] 0/1036, RunningAvgSamplesPerSec=11.985112867324592, CurrSamplesPerSec=11.994475763188472, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:21:52,343] [INFO] [timer.py:197:stop] 0/1037, RunningAvgSamplesPerSec=11.985126586584279, CurrSamplesPerSec=11.999329127626316, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:21:58,830] [INFO] [timer.py:197:stop] 0/1038, RunningAvgSamplesPerSec=11.98512807095093, CurrSamplesPerSec=11.986664587585492, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:22:05,293] [INFO] [timer.py:197:stop] 0/1039, RunningAvgSamplesPerSec=11.98502877762385, CurrSamplesPerSec=11.883037133304665, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:22:11,802] [INFO] [logging.py:68:log_dist] [Rank 0] step=1040, skipped=4, lr=[8.811111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 20:22:11,802] [INFO] [timer.py:197:stop] 0/1040, RunningAvgSamplesPerSec=11.98491688137689, CurrSamplesPerSec=11.86999421259447, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:22:18,264] [INFO] [timer.py:197:stop] 0/1041, RunningAvgSamplesPerSec=11.984833340681444, CurrSamplesPerSec=11.898741608317415, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:22:24,759] [INFO] [timer.py:197:stop] 0/1042, RunningAvgSamplesPerSec=11.984746159139776, CurrSamplesPerSec=11.894844673581485, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:22:31,299] [INFO] [timer.py:197:stop] 0/1043, RunningAvgSamplesPerSec=11.984648052252966, CurrSamplesPerSec=11.883479019531752, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:22:37,742] [INFO] [timer.py:197:stop] 0/1044, RunningAvgSamplesPerSec=11.984587812543458, CurrSamplesPerSec=11.922205007481928, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:22:44,282] [INFO] [timer.py:197:stop] 0/1045, RunningAvgSamplesPerSec=11.984550012714871, CurrSamplesPerSec=11.94529173811832, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:22:50,882] [INFO] [timer.py:197:stop] 0/1046, RunningAvgSamplesPerSec=11.98427721518935, 
CurrSamplesPerSec=11.70635410584949, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:22:57,459] [INFO] [timer.py:197:stop] 0/1047, RunningAvgSamplesPerSec=11.983990259539594, CurrSamplesPerSec=11.691721829873421, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:23:03,994] [INFO] [timer.py:197:stop] 0/1048, RunningAvgSamplesPerSec=11.983844028558456, CurrSamplesPerSec=11.83295850349597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:23:10,541] [INFO] [timer.py:197:stop] 0/1049, RunningAvgSamplesPerSec=11.983749973755693, CurrSamplesPerSec=11.886170500535183, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:23:17,032] [INFO] [logging.py:68:log_dist] [Rank 0] step=1050, skipped=4, lr=[8.788888888888891e-06], mom=[[0.9, 0.999]] [2022-12-19 20:23:17,033] [INFO] [timer.py:197:stop] 0/1050, RunningAvgSamplesPerSec=11.983644181787682, CurrSamplesPerSec=11.873895362688533, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0013, 'learning_rate': 8.788888888888891e-06, 'epoch': 27.63} [2022-12-19 20:23:23,418] [INFO] [timer.py:197:stop] 0/1051, RunningAvgSamplesPerSec=11.983637313948767, CurrSamplesPerSec=11.976444143179199, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:23:29,946] [INFO] [timer.py:197:stop] 0/1052, RunningAvgSamplesPerSec=11.983600429105973, CurrSamplesPerSec=11.94503287320327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:23:36,387] [INFO] [timer.py:197:stop] 0/1053, RunningAvgSamplesPerSec=11.983589294470178, CurrSamplesPerSec=11.971909332882111, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:23:42,872] [INFO] [timer.py:197:stop] 0/1054, RunningAvgSamplesPerSec=11.98357740398825, CurrSamplesPerSec=11.97109353848352, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:23:49,341] [INFO] [timer.py:197:stop] 0/1055, RunningAvgSamplesPerSec=11.983482810889667, CurrSamplesPerSec=11.884791194444391, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
20:23:55,844] [INFO] [timer.py:197:stop] 0/1056, RunningAvgSamplesPerSec=11.983384979918295, CurrSamplesPerSec=11.881247830099706, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:24:02,355] [INFO] [timer.py:197:stop] 0/1057, RunningAvgSamplesPerSec=11.983288141213084, CurrSamplesPerSec=11.882082981580423, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:24:08,901] [INFO] [timer.py:197:stop] 0/1058, RunningAvgSamplesPerSec=11.983147398647855, CurrSamplesPerSec=11.836483035806806, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:24:15,423] [INFO] [timer.py:197:stop] 0/1059, RunningAvgSamplesPerSec=11.983088713882328, CurrSamplesPerSec=11.921436739528675, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:24:21,974] [INFO] [logging.py:68:log_dist] [Rank 0] step=1060, skipped=4, lr=[8.766666666666669e-06], mom=[[0.9, 0.999]] [2022-12-19 20:24:21,974] [INFO] [timer.py:197:stop] 0/1060, RunningAvgSamplesPerSec=11.982996289230151, CurrSamplesPerSec=11.886094185651501, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:24:28,441] [INFO] [timer.py:197:stop] 0/1061, RunningAvgSamplesPerSec=11.982883243074514, CurrSamplesPerSec=11.864463491671177, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:24:35,011] [INFO] [timer.py:197:stop] 0/1062, RunningAvgSamplesPerSec=11.982767587450493, CurrSamplesPerSec=11.861528669557952, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:24:41,550] [INFO] [timer.py:197:stop] 0/1063, RunningAvgSamplesPerSec=11.982666056246028, CurrSamplesPerSec=11.876001894246404, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:24:46,207] [INFO] [timer.py:197:stop] 0/1064, RunningAvgSamplesPerSec=11.985776127549727, CurrSamplesPerSec=16.540772888152006, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:24:52,702] [INFO] [timer.py:197:stop] 0/1065, RunningAvgSamplesPerSec=11.985760612659842, CurrSamplesPerSec=11.969306440433256, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 20:24:59,187] [INFO] [timer.py:197:stop] 0/1066, RunningAvgSamplesPerSec=11.985625630766906, CurrSamplesPerSec=11.843838873589677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:25:05,734] [INFO] [timer.py:197:stop] 0/1067, RunningAvgSamplesPerSec=11.985529877669745, CurrSamplesPerSec=11.884508110548163, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:25:12,250] [INFO] [timer.py:197:stop] 0/1068, RunningAvgSamplesPerSec=11.985434031149284, CurrSamplesPerSec=11.88422030487322, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:25:18,726] [INFO] [timer.py:197:stop] 0/1069, RunningAvgSamplesPerSec=11.985403004259974, CurrSamplesPerSec=11.952419446368843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:25:25,184] [INFO] [logging.py:68:log_dist] [Rank 0] step=1070, skipped=4, lr=[8.744444444444446e-06], mom=[[0.9, 0.999]] [2022-12-19 20:25:25,185] [INFO] [timer.py:197:stop] 0/1070, RunningAvgSamplesPerSec=11.985309877648744, CurrSamplesPerSec=11.886761579338259, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:25:31,641] [INFO] [timer.py:197:stop] 0/1071, RunningAvgSamplesPerSec=11.985305038095623, CurrSamplesPerSec=11.980138625457101, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:25:38,139] [INFO] [timer.py:197:stop] 0/1072, RunningAvgSamplesPerSec=11.985273969856914, CurrSamplesPerSec=11.952153886363565, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:25:44,658] [INFO] [timer.py:197:stop] 0/1073, RunningAvgSamplesPerSec=11.985161854449727, CurrSamplesPerSec=11.866388323743045, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:25:51,184] [INFO] [timer.py:197:stop] 0/1074, RunningAvgSamplesPerSec=11.98514116760316, CurrSamplesPerSec=11.963026474103424, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:25:57,728] [INFO] [timer.py:197:stop] 0/1075, RunningAvgSamplesPerSec=11.985115945749218, CurrSamplesPerSec=11.958139033672104, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.002, 'learning_rate': 8.733333333333333e-06, 'epoch': 28.29} [2022-12-19 20:26:04,244] [INFO] [timer.py:197:stop] 0/1076, RunningAvgSamplesPerSec=11.985021103985039, CurrSamplesPerSec=11.884113499366649, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:26:10,821] [INFO] [timer.py:197:stop] 0/1077, RunningAvgSamplesPerSec=11.984816346011325, CurrSamplesPerSec=11.768872340209118, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:26:17,310] [INFO] [timer.py:197:stop] 0/1078, RunningAvgSamplesPerSec=11.98478204624457, CurrSamplesPerSec=11.94802299468085, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:26:23,865] [INFO] [timer.py:197:stop] 0/1079, RunningAvgSamplesPerSec=11.98464696149143, CurrSamplesPerSec=11.841039078913543, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:26:30,342] [INFO] [logging.py:68:log_dist] [Rank 0] step=1080, skipped=4, lr=[8.722222222222224e-06], mom=[[0.9, 0.999]] [2022-12-19 20:26:30,342] [INFO] [timer.py:197:stop] 0/1080, RunningAvgSamplesPerSec=11.984457201834502, CurrSamplesPerSec=11.783515888591301, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:26:36,845] [INFO] [timer.py:197:stop] 0/1081, RunningAvgSamplesPerSec=11.9842838881984, CurrSamplesPerSec=11.800322364732493, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:26:43,374] [INFO] [timer.py:197:stop] 0/1082, RunningAvgSamplesPerSec=11.984148263504078, CurrSamplesPerSec=11.839576230911812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:26:49,837] [INFO] [timer.py:197:stop] 0/1083, RunningAvgSamplesPerSec=11.984033588594501, CurrSamplesPerSec=11.861452670769875, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:26:56,329] [INFO] [timer.py:197:stop] 0/1084, RunningAvgSamplesPerSec=11.983986248108742, CurrSamplesPerSec=11.933028986451088, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:27:02,816] [INFO] [timer.py:197:stop] 
0/1085, RunningAvgSamplesPerSec=11.983889281985055, CurrSamplesPerSec=11.879883335836011, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:27:09,279] [INFO] [timer.py:197:stop] 0/1086, RunningAvgSamplesPerSec=11.983870234777996, CurrSamplesPerSec=11.963277588881896, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:27:15,820] [INFO] [timer.py:197:stop] 0/1087, RunningAvgSamplesPerSec=11.983823131170523, CurrSamplesPerSec=11.932979652867285, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:27:22,894] [INFO] [timer.py:197:stop] 0/1088, RunningAvgSamplesPerSec=11.983614748212192, CurrSamplesPerSec=11.761709790014612, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:27:30,004] [INFO] [timer.py:197:stop] 0/1089, RunningAvgSamplesPerSec=11.983393461157887, CurrSamplesPerSec=11.747804623577233, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:27:36,764] [INFO] [logging.py:68:log_dist] [Rank 0] step=1090, skipped=4, lr=[8.700000000000001e-06], mom=[[0.9, 0.999]] [2022-12-19 20:27:36,765] [INFO] [timer.py:197:stop] 0/1090, RunningAvgSamplesPerSec=11.98332293367454, CurrSamplesPerSec=11.907147341377403, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:27:43,616] [INFO] [timer.py:197:stop] 0/1091, RunningAvgSamplesPerSec=11.983261406135885, CurrSamplesPerSec=11.916691663701576, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:27:50,633] [INFO] [timer.py:197:stop] 0/1092, RunningAvgSamplesPerSec=11.983157562631458, CurrSamplesPerSec=11.871130164519915, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:27:57,323] [INFO] [timer.py:197:stop] 0/1093, RunningAvgSamplesPerSec=11.983127027680128, CurrSamplesPerSec=11.949936202713952, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:28:03,841] [INFO] [timer.py:197:stop] 0/1094, RunningAvgSamplesPerSec=11.983065348956583, CurrSamplesPerSec=11.916149972550752, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:28:10,317] [INFO] 
[timer.py:197:stop] 0/1095, RunningAvgSamplesPerSec=11.982966705288803, CurrSamplesPerSec=11.876208385061828, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:28:16,885] [INFO] [timer.py:197:stop] 0/1096, RunningAvgSamplesPerSec=11.98282760366482, CurrSamplesPerSec=11.832696138942289, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:28:23,393] [INFO] [timer.py:197:stop] 0/1097, RunningAvgSamplesPerSec=11.982735639776411, CurrSamplesPerSec=11.882965592911322, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:28:30,062] [INFO] [timer.py:197:stop] 0/1098, RunningAvgSamplesPerSec=11.982613110911698, CurrSamplesPerSec=11.849931000044013, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:28:36,678] [INFO] [timer.py:197:stop] 0/1099, RunningAvgSamplesPerSec=11.982579933719503, CurrSamplesPerSec=11.946327841641098, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:28:43,297] [INFO] [logging.py:68:log_dist] [Rank 0] step=1100, skipped=4, lr=[8.677777777777779e-06], mom=[[0.9, 0.999]] [2022-12-19 20:28:43,298] [INFO] [timer.py:197:stop] 0/1100, RunningAvgSamplesPerSec=11.982485805286641, CurrSamplesPerSec=11.880109941117135, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.001, 'learning_rate': 8.677777777777779e-06, 'epoch': 28.95} [2022-12-19 20:28:49,765] [INFO] [timer.py:197:stop] 0/1101, RunningAvgSamplesPerSec=11.982463676251783, CurrSamplesPerSec=11.958215211068032, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:28:54,399] [INFO] [timer.py:197:stop] 0/1102, RunningAvgSamplesPerSec=11.985483521156345, CurrSamplesPerSec=16.57679529256495, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:29:01,080] [INFO] [timer.py:197:stop] 0/1103, RunningAvgSamplesPerSec=11.985393568810386, CurrSamplesPerSec=11.887256909695749, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:29:07,776] [INFO] [timer.py:197:stop] 0/1104, RunningAvgSamplesPerSec=11.985278255189614, 
CurrSamplesPerSec=11.859649950612672, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:29:14,249] [INFO] [timer.py:197:stop] 0/1105, RunningAvgSamplesPerSec=11.98513558257049, CurrSamplesPerSec=11.82994800886935, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:29:20,688] [INFO] [timer.py:197:stop] 0/1106, RunningAvgSamplesPerSec=11.985109476924965, CurrSamplesPerSec=11.956384026100315, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:29:27,167] [INFO] [timer.py:197:stop] 0/1107, RunningAvgSamplesPerSec=11.985043714171962, CurrSamplesPerSec=11.912879183590944, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:29:33,702] [INFO] [timer.py:197:stop] 0/1108, RunningAvgSamplesPerSec=11.984951697867949, CurrSamplesPerSec=11.88412980944296, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:29:40,377] [INFO] [timer.py:197:stop] 0/1109, RunningAvgSamplesPerSec=11.984811203200689, CurrSamplesPerSec=11.831414736143394, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:29:46,886] [INFO] [logging.py:68:log_dist] [Rank 0] step=1110, skipped=4, lr=[8.655555555555557e-06], mom=[[0.9, 0.999]] [2022-12-19 20:29:46,886] [INFO] [timer.py:197:stop] 0/1110, RunningAvgSamplesPerSec=11.984766226887746, CurrSamplesPerSec=11.935183617487162, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:29:53,436] [INFO] [timer.py:197:stop] 0/1111, RunningAvgSamplesPerSec=11.984721866617127, CurrSamplesPerSec=11.935771620412917, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:30:00,014] [INFO] [timer.py:197:stop] 0/1112, RunningAvgSamplesPerSec=11.98450860770894, CurrSamplesPerSec=11.752585413351564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:30:06,735] [INFO] [timer.py:197:stop] 0/1113, RunningAvgSamplesPerSec=11.984318154631033, CurrSamplesPerSec=11.776582980031327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:30:13,343] [INFO] [timer.py:197:stop] 0/1114, 
RunningAvgSamplesPerSec=11.984234457759378, CurrSamplesPerSec=11.891963817587017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:30:19,903] [INFO] [timer.py:197:stop] 0/1115, RunningAvgSamplesPerSec=11.98415166571858, CurrSamplesPerSec=11.892789411392656, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:30:26,434] [INFO] [timer.py:197:stop] 0/1116, RunningAvgSamplesPerSec=11.983999649165487, CurrSamplesPerSec=11.817162804555505, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:30:32,939] [INFO] [timer.py:197:stop] 0/1117, RunningAvgSamplesPerSec=11.983963343285309, CurrSamplesPerSec=11.943654752590772, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:30:39,456] [INFO] [timer.py:197:stop] 0/1118, RunningAvgSamplesPerSec=11.98390602800619, CurrSamplesPerSec=11.920338780661565, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:30:45,983] [INFO] [timer.py:197:stop] 0/1119, RunningAvgSamplesPerSec=11.983847331120435, CurrSamplesPerSec=11.918698042683461, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:30:52,657] [INFO] [logging.py:68:log_dist] [Rank 0] step=1120, skipped=4, lr=[8.633333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 20:30:52,657] [INFO] [timer.py:197:stop] 0/1120, RunningAvgSamplesPerSec=11.98374010633495, CurrSamplesPerSec=11.865156254385584, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:30:59,327] [INFO] [timer.py:197:stop] 0/1121, RunningAvgSamplesPerSec=11.983631637977323, CurrSamplesPerSec=11.863579956397821, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:31:05,920] [INFO] [timer.py:197:stop] 0/1122, RunningAvgSamplesPerSec=11.983491255712577, CurrSamplesPerSec=11.828437863186757, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:31:12,496] [INFO] [timer.py:197:stop] 0/1123, RunningAvgSamplesPerSec=11.983389931781655, CurrSamplesPerSec=11.870972672066568, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:31:19,001] [INFO] 
[timer.py:197:stop] 0/1124, RunningAvgSamplesPerSec=11.983348391483117, CurrSamplesPerSec=11.936962132232598, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:31:25,583] [INFO] [timer.py:197:stop] 0/1125, RunningAvgSamplesPerSec=11.983170943851857, CurrSamplesPerSec=11.78733140776729, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0025, 'learning_rate': 8.622222222222223e-06, 'epoch': 29.61} [2022-12-19 20:31:32,191] [INFO] [timer.py:197:stop] 0/1126, RunningAvgSamplesPerSec=11.983046895639509, CurrSamplesPerSec=11.845343023226823, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:31:38,776] [INFO] [timer.py:197:stop] 0/1127, RunningAvgSamplesPerSec=11.98289337014892, CurrSamplesPerSec=11.812782620569632, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:31:45,477] [INFO] [timer.py:197:stop] 0/1128, RunningAvgSamplesPerSec=11.982601444177801, CurrSamplesPerSec=11.662953364037376, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:31:52,174] [INFO] [timer.py:197:stop] 0/1129, RunningAvgSamplesPerSec=11.982432163344674, CurrSamplesPerSec=11.794809203077195, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:31:58,925] [INFO] [logging.py:68:log_dist] [Rank 0] step=1130, skipped=4, lr=[8.611111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 20:31:58,926] [INFO] [timer.py:197:stop] 0/1130, RunningAvgSamplesPerSec=11.982293244766963, CurrSamplesPerSec=11.827753029481011, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:32:05,467] [INFO] [timer.py:197:stop] 0/1131, RunningAvgSamplesPerSec=11.982061095872888, CurrSamplesPerSec=11.725802553345027, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:32:11,992] [INFO] [timer.py:197:stop] 0/1132, RunningAvgSamplesPerSec=11.981940582652635, CurrSamplesPerSec=11.84741015352242, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:32:18,486] [INFO] [timer.py:197:stop] 0/1133, RunningAvgSamplesPerSec=11.981894155505422, 
CurrSamplesPerSec=11.92966038689747, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:32:25,091] [INFO] [timer.py:197:stop] 0/1134, RunningAvgSamplesPerSec=11.981722874386431, CurrSamplesPerSec=11.791088800999386, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:32:31,700] [INFO] [timer.py:197:stop] 0/1135, RunningAvgSamplesPerSec=11.981584532444861, CurrSamplesPerSec=11.827003660027016, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:32:38,321] [INFO] [timer.py:197:stop] 0/1136, RunningAvgSamplesPerSec=11.981448398083113, CurrSamplesPerSec=11.829170212231526, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:32:45,064] [INFO] [timer.py:197:stop] 0/1137, RunningAvgSamplesPerSec=11.981281247328486, CurrSamplesPerSec=11.794686896598996, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:32:51,562] [INFO] [timer.py:197:stop] 0/1138, RunningAvgSamplesPerSec=11.98118095023026, CurrSamplesPerSec=11.86841610423302, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:32:58,109] [INFO] [timer.py:197:stop] 0/1139, RunningAvgSamplesPerSec=11.981133844776224, CurrSamplesPerSec=11.9278601965713, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:33:02,756] [INFO] [logging.py:68:log_dist] [Rank 0] step=1140, skipped=4, lr=[8.58888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 20:33:02,756] [INFO] [timer.py:197:stop] 0/1140, RunningAvgSamplesPerSec=11.98401831083999, CurrSamplesPerSec=16.50085554610003, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:33:09,390] [INFO] [timer.py:197:stop] 0/1141, RunningAvgSamplesPerSec=11.983932365441635, CurrSamplesPerSec=11.886918964637323, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:33:15,946] [INFO] [timer.py:197:stop] 0/1142, RunningAvgSamplesPerSec=11.98379353985404, CurrSamplesPerSec=11.827732183427774, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:33:22,465] [INFO] [timer.py:197:stop] 0/1143, 
RunningAvgSamplesPerSec=11.983795160412427, CurrSamplesPerSec=11.985642882073503, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:33:29,068] [INFO] [timer.py:197:stop] 0/1144, RunningAvgSamplesPerSec=11.983642181355224, CurrSamplesPerSec=11.811601156433586, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:33:35,587] [INFO] [timer.py:197:stop] 0/1145, RunningAvgSamplesPerSec=11.983497418812181, CurrSamplesPerSec=11.820430165434662, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:33:42,202] [INFO] [timer.py:197:stop] 0/1146, RunningAvgSamplesPerSec=11.983362810439864, CurrSamplesPerSec=11.831457497246486, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:33:48,816] [INFO] [timer.py:197:stop] 0/1147, RunningAvgSamplesPerSec=11.983268370474876, CurrSamplesPerSec=11.87619524929991, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:33:55,408] [INFO] [timer.py:197:stop] 0/1148, RunningAvgSamplesPerSec=11.983148019782835, CurrSamplesPerSec=11.846914478384686, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:34:02,064] [INFO] [timer.py:197:stop] 0/1149, RunningAvgSamplesPerSec=11.98304639577358, CurrSamplesPerSec=11.86770722042221, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:34:08,629] [INFO] [logging.py:68:log_dist] [Rank 0] step=1150, skipped=4, lr=[8.566666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 20:34:08,630] [INFO] [timer.py:197:stop] 0/1150, RunningAvgSamplesPerSec=11.983018497086025, CurrSamplesPerSec=11.951104002142722, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0012, 'learning_rate': 8.566666666666667e-06, 'epoch': 30.26} [2022-12-19 20:34:15,257] [INFO] [timer.py:197:stop] 0/1151, RunningAvgSamplesPerSec=11.983000718438793, CurrSamplesPerSec=11.962625565353255, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:34:21,819] [INFO] [timer.py:197:stop] 0/1152, RunningAvgSamplesPerSec=11.98287621534954, CurrSamplesPerSec=11.84151127838786, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 20:34:28,297] [INFO] [timer.py:197:stop] 0/1153, RunningAvgSamplesPerSec=11.982776499954051, CurrSamplesPerSec=11.86919172327849, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:34:34,908] [INFO] [timer.py:197:stop] 0/1154, RunningAvgSamplesPerSec=11.982544233232568, CurrSamplesPerSec=11.721044547709528, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:34:41,370] [INFO] [timer.py:197:stop] 0/1155, RunningAvgSamplesPerSec=11.982519212412615, CurrSamplesPerSec=11.953764457513275, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:34:47,996] [INFO] [timer.py:197:stop] 0/1156, RunningAvgSamplesPerSec=11.982376978385444, CurrSamplesPerSec=11.820597250585756, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:34:54,603] [INFO] [timer.py:197:stop] 0/1157, RunningAvgSamplesPerSec=11.98231624431853, CurrSamplesPerSec=11.912637052673883, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:35:01,273] [INFO] [timer.py:197:stop] 0/1158, RunningAvgSamplesPerSec=11.98219661869138, CurrSamplesPerSec=11.845605426194615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:35:07,737] [INFO] [timer.py:197:stop] 0/1159, RunningAvgSamplesPerSec=11.982106225146978, CurrSamplesPerSec=11.878515475497581, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:35:14,307] [INFO] [logging.py:68:log_dist] [Rank 0] step=1160, skipped=4, lr=[8.544444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 20:35:14,308] [INFO] [timer.py:197:stop] 0/1160, RunningAvgSamplesPerSec=11.982052437421588, CurrSamplesPerSec=11.920141868253204, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:35:20,896] [INFO] [timer.py:197:stop] 0/1161, RunningAvgSamplesPerSec=11.982022948531892, CurrSamplesPerSec=11.947971941641955, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:35:27,480] [INFO] [timer.py:197:stop] 0/1162, RunningAvgSamplesPerSec=11.981941584554216, CurrSamplesPerSec=11.888377740420646, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:35:34,021] [INFO] [timer.py:197:stop] 0/1163, RunningAvgSamplesPerSec=11.981828571617315, CurrSamplesPerSec=11.852153581679097, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:35:40,500] [INFO] [timer.py:197:stop] 0/1164, RunningAvgSamplesPerSec=11.981818350107048, CurrSamplesPerSec=11.969962928784337, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:35:46,975] [INFO] [timer.py:197:stop] 0/1165, RunningAvgSamplesPerSec=11.981797599324471, CurrSamplesPerSec=11.957733658400556, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:35:53,529] [INFO] [timer.py:197:stop] 0/1166, RunningAvgSamplesPerSec=11.981706216213002, CurrSamplesPerSec=11.876362863802171, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:36:00,083] [INFO] [timer.py:197:stop] 0/1167, RunningAvgSamplesPerSec=11.981603804207829, CurrSamplesPerSec=11.863571567376075, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:36:06,615] [INFO] [timer.py:197:stop] 0/1168, RunningAvgSamplesPerSec=11.981483498260122, CurrSamplesPerSec=11.842949002455985, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:36:13,122] [INFO] [timer.py:197:stop] 0/1169, RunningAvgSamplesPerSec=11.981476884649963, CurrSamplesPerSec=11.973770379484487, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:36:19,615] [INFO] [logging.py:68:log_dist] [Rank 0] step=1170, skipped=4, lr=[8.522222222222222e-06], mom=[[0.9, 0.999]] [2022-12-19 20:36:19,615] [INFO] [timer.py:197:stop] 0/1170, RunningAvgSamplesPerSec=11.981373872790066, CurrSamplesPerSec=11.862354236380279, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:36:26,095] [INFO] [timer.py:197:stop] 0/1171, RunningAvgSamplesPerSec=11.981353364133042, CurrSamplesPerSec=11.95744708904912, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:36:32,603] [INFO] [timer.py:197:stop] 0/1172, RunningAvgSamplesPerSec=11.981262123949378, 
CurrSamplesPerSec=11.875544277343158, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:36:39,142] [INFO] [timer.py:197:stop] 0/1173, RunningAvgSamplesPerSec=11.981175527761696, CurrSamplesPerSec=11.880708304730899, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:36:45,664] [INFO] [timer.py:197:stop] 0/1174, RunningAvgSamplesPerSec=11.98100772761157, CurrSamplesPerSec=11.787687007109003, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:36:52,202] [INFO] [timer.py:197:stop] 0/1175, RunningAvgSamplesPerSec=11.980889670284203, CurrSamplesPerSec=11.844107481050056, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0018, 'learning_rate': 8.511111111111113e-06, 'epoch': 30.92} [2022-12-19 20:36:58,653] [INFO] [timer.py:197:stop] 0/1176, RunningAvgSamplesPerSec=11.980851775483846, CurrSamplesPerSec=11.936565622429242, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:37:05,135] [INFO] [timer.py:197:stop] 0/1177, RunningAvgSamplesPerSec=11.980828736241332, CurrSamplesPerSec=11.953841643829785, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:37:09,720] [INFO] [timer.py:197:stop] 0/1178, RunningAvgSamplesPerSec=11.983651838209251, CurrSamplesPerSec=16.571943479110132, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:37:16,170] [INFO] [timer.py:197:stop] 0/1179, RunningAvgSamplesPerSec=11.983557564584745, CurrSamplesPerSec=11.873708910240557, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:37:22,696] [INFO] [logging.py:68:log_dist] [Rank 0] step=1180, skipped=4, lr=[8.5e-06], mom=[[0.9, 0.999]] [2022-12-19 20:37:22,696] [INFO] [timer.py:197:stop] 0/1180, RunningAvgSamplesPerSec=11.983473660424366, CurrSamplesPerSec=11.885526328431732, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:37:29,168] [INFO] [timer.py:197:stop] 0/1181, RunningAvgSamplesPerSec=11.983389066703026, CurrSamplesPerSec=11.884560201182039, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:37:35,749] 
[INFO] [timer.py:197:stop] 0/1182, RunningAvgSamplesPerSec=11.983305715560126, CurrSamplesPerSec=11.885834722376888, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:37:42,249] [INFO] [timer.py:197:stop] 0/1183, RunningAvgSamplesPerSec=11.983221743493917, CurrSamplesPerSec=11.884948001365437, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:37:48,736] [INFO] [timer.py:197:stop] 0/1184, RunningAvgSamplesPerSec=11.98319031232334, CurrSamplesPerSec=11.946184828462272, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:37:55,210] [INFO] [timer.py:197:stop] 0/1185, RunningAvgSamplesPerSec=11.983109232986033, CurrSamplesPerSec=11.88803446684912, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:38:01,707] [INFO] [timer.py:197:stop] 0/1186, RunningAvgSamplesPerSec=11.98309141188603, CurrSamplesPerSec=11.962046107775118, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:38:08,258] [INFO] [timer.py:197:stop] 0/1187, RunningAvgSamplesPerSec=11.982982258808779, CurrSamplesPerSec=11.85512512747876, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:38:14,803] [INFO] [timer.py:197:stop] 0/1188, RunningAvgSamplesPerSec=11.982860509845038, CurrSamplesPerSec=11.84030578182275, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:38:21,350] [INFO] [timer.py:197:stop] 0/1189, RunningAvgSamplesPerSec=11.982769359606076, CurrSamplesPerSec=11.875632540544908, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:38:27,860] [INFO] [logging.py:68:log_dist] [Rank 0] step=1190, skipped=4, lr=[8.477777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 20:38:27,860] [INFO] [timer.py:197:stop] 0/1190, RunningAvgSamplesPerSec=11.982669556012967, CurrSamplesPerSec=11.86536341716228, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:38:34,364] [INFO] [timer.py:197:stop] 0/1191, RunningAvgSamplesPerSec=11.982653018131796, CurrSamplesPerSec=11.96303820322143, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
20:38:40,894] [INFO] [timer.py:197:stop] 0/1192, RunningAvgSamplesPerSec=11.982540833573823, CurrSamplesPerSec=11.850623112991185, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:38:47,389] [INFO] [timer.py:197:stop] 0/1193, RunningAvgSamplesPerSec=11.982434497647024, CurrSamplesPerSec=11.857218194927503, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:38:53,870] [INFO] [timer.py:197:stop] 0/1194, RunningAvgSamplesPerSec=11.982350311825856, CurrSamplesPerSec=11.882917724452678, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:39:00,361] [INFO] [timer.py:197:stop] 0/1195, RunningAvgSamplesPerSec=11.982278592468262, CurrSamplesPerSec=11.897395240074921, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:39:06,842] [INFO] [timer.py:197:stop] 0/1196, RunningAvgSamplesPerSec=11.982260793122816, CurrSamplesPerSec=11.96106377024958, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:39:13,315] [INFO] [timer.py:197:stop] 0/1197, RunningAvgSamplesPerSec=11.982224742335571, CurrSamplesPerSec=11.939334309910674, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:39:19,879] [INFO] [timer.py:197:stop] 0/1198, RunningAvgSamplesPerSec=11.982119611390425, CurrSamplesPerSec=11.85779277878057, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:39:26,384] [INFO] [timer.py:197:stop] 0/1199, RunningAvgSamplesPerSec=11.982026234428412, CurrSamplesPerSec=11.871379536117683, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:39:32,818] [INFO] [logging.py:68:log_dist] [Rank 0] step=1200, skipped=4, lr=[8.455555555555555e-06], mom=[[0.9, 0.999]] [2022-12-19 20:39:32,819] [INFO] [timer.py:197:stop] 0/1200, RunningAvgSamplesPerSec=11.982003991505959, CurrSamplesPerSec=11.955438293323832, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.002, 'learning_rate': 8.455555555555555e-06, 'epoch': 31.58} [2022-12-19 20:39:39,385] [INFO] [timer.py:197:stop] 0/1201, RunningAvgSamplesPerSec=11.981959454006642, 
CurrSamplesPerSec=11.928840268151928, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:39:45,891] [INFO] [timer.py:197:stop] 0/1202, RunningAvgSamplesPerSec=11.981834504291225, CurrSamplesPerSec=11.833871393300594, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:39:52,408] [INFO] [timer.py:197:stop] 0/1203, RunningAvgSamplesPerSec=11.981777317064232, CurrSamplesPerSec=11.913543772518295, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:39:58,868] [INFO] [timer.py:197:stop] 0/1204, RunningAvgSamplesPerSec=11.981741398124388, CurrSamplesPerSec=11.938757637471824, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:40:05,328] [INFO] [timer.py:197:stop] 0/1205, RunningAvgSamplesPerSec=11.981722265317478, CurrSamplesPerSec=11.958768724929273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:40:11,940] [INFO] [timer.py:197:stop] 0/1206, RunningAvgSamplesPerSec=11.98153435333365, CurrSamplesPerSec=11.759665757690973, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:40:18,511] [INFO] [timer.py:197:stop] 0/1207, RunningAvgSamplesPerSec=11.981397180435588, CurrSamplesPerSec=11.818488472331923, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:40:24,999] [INFO] [timer.py:197:stop] 0/1208, RunningAvgSamplesPerSec=11.98138254776994, CurrSamplesPerSec=11.9637761176096, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:40:31,551] [INFO] [timer.py:197:stop] 0/1209, RunningAvgSamplesPerSec=11.981309264816481, CurrSamplesPerSec=11.89357770491536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:40:38,023] [INFO] [logging.py:68:log_dist] [Rank 0] step=1210, skipped=4, lr=[8.433333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 20:40:38,024] [INFO] [timer.py:197:stop] 0/1210, RunningAvgSamplesPerSec=11.98130032100117, CurrSamplesPerSec=11.970514861701066, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:40:44,476] [INFO] [timer.py:197:stop] 0/1211, 
RunningAvgSamplesPerSec=11.981296643716709, CurrSamplesPerSec=11.9768561318044, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:40:51,008] [INFO] [timer.py:197:stop] 0/1212, RunningAvgSamplesPerSec=11.981189946975894, CurrSamplesPerSec=11.853568766633485, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:40:57,510] [INFO] [timer.py:197:stop] 0/1213, RunningAvgSamplesPerSec=11.98116302211649, CurrSamplesPerSec=11.948672363584794, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:41:03,999] [INFO] [timer.py:197:stop] 0/1214, RunningAvgSamplesPerSec=11.981132546321492, CurrSamplesPerSec=11.944339787067314, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:41:10,486] [INFO] [timer.py:197:stop] 0/1215, RunningAvgSamplesPerSec=11.981122011325395, CurrSamplesPerSec=11.968367200215615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:41:15,194] [INFO] [timer.py:197:stop] 0/1216, RunningAvgSamplesPerSec=11.98377398676904, CurrSamplesPerSec=16.38230717492388, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:41:21,646] [INFO] [timer.py:197:stop] 0/1217, RunningAvgSamplesPerSec=11.98377590691503, CurrSamplesPerSec=11.98610741804097, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:41:28,166] [INFO] [timer.py:197:stop] 0/1218, RunningAvgSamplesPerSec=11.983687426267121, CurrSamplesPerSec=11.877140047408483, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:41:34,700] [INFO] [timer.py:197:stop] 0/1219, RunningAvgSamplesPerSec=11.983554574278827, CurrSamplesPerSec=11.824157130220911, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:41:41,289] [INFO] [logging.py:68:log_dist] [Rank 0] step=1220, skipped=4, lr=[8.411111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 20:41:41,290] [INFO] [timer.py:197:stop] 0/1220, RunningAvgSamplesPerSec=11.98334449449036, CurrSamplesPerSec=11.733022452220474, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:41:47,835] [INFO] [timer.py:197:stop] 
0/1221, RunningAvgSamplesPerSec=11.98323299985034, CurrSamplesPerSec=11.848955483033205, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:41:54,297] [INFO] [timer.py:197:stop] 0/1222, RunningAvgSamplesPerSec=11.983179734167033, CurrSamplesPerSec=11.918599083579014, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:42:00,758] [INFO] [timer.py:197:stop] 0/1223, RunningAvgSamplesPerSec=11.983126156209496, CurrSamplesPerSec=11.918115953900744, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:42:07,834] [INFO] [timer.py:197:stop] 0/1224, RunningAvgSamplesPerSec=11.982935811841763, CurrSamplesPerSec=11.754950757684737, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:42:14,640] [INFO] [timer.py:197:stop] 0/1225, RunningAvgSamplesPerSec=11.982872818161079, CurrSamplesPerSec=11.906386293815945, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0006, 'learning_rate': 8.400000000000001e-06, 'epoch': 32.24} [2022-12-19 20:42:21,758] [INFO] [timer.py:197:stop] 0/1226, RunningAvgSamplesPerSec=11.982734283927137, CurrSamplesPerSec=11.815671003613076, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:42:28,688] [INFO] [timer.py:197:stop] 0/1227, RunningAvgSamplesPerSec=11.98265683640482, CurrSamplesPerSec=11.888605722934374, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:42:35,216] [INFO] [timer.py:197:stop] 0/1228, RunningAvgSamplesPerSec=11.982556169348781, CurrSamplesPerSec=11.860496217558161, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:42:41,676] [INFO] [timer.py:197:stop] 0/1229, RunningAvgSamplesPerSec=11.982549777051112, CurrSamplesPerSec=11.974717946544676, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:42:48,328] [INFO] [logging.py:68:log_dist] [Rank 0] step=1230, skipped=4, lr=[8.38888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 20:42:48,329] [INFO] [timer.py:197:stop] 0/1230, RunningAvgSamplesPerSec=11.98247138425705, CurrSamplesPerSec=11.887050034505089, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:42:55,016] [INFO] [timer.py:197:stop] 0/1231, RunningAvgSamplesPerSec=11.982365578513475, CurrSamplesPerSec=11.853831010719146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:43:01,602] [INFO] [timer.py:197:stop] 0/1232, RunningAvgSamplesPerSec=11.982280253443436, CurrSamplesPerSec=11.8783262498329, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:43:08,136] [INFO] [timer.py:197:stop] 0/1233, RunningAvgSamplesPerSec=11.982263102006524, CurrSamplesPerSec=11.961203942023985, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:43:14,599] [INFO] [timer.py:197:stop] 0/1234, RunningAvgSamplesPerSec=11.982246405651749, CurrSamplesPerSec=11.961728416131887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:43:21,107] [INFO] [timer.py:197:stop] 0/1235, RunningAvgSamplesPerSec=11.982148928874139, CurrSamplesPerSec=11.863250171914858, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:43:27,674] [INFO] [timer.py:197:stop] 0/1236, RunningAvgSamplesPerSec=11.982056914106604, CurrSamplesPerSec=11.869667745378493, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:43:34,343] [INFO] [timer.py:197:stop] 0/1237, RunningAvgSamplesPerSec=11.981962774457578, CurrSamplesPerSec=11.866910810990031, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:43:40,964] [INFO] [timer.py:197:stop] 0/1238, RunningAvgSamplesPerSec=11.981865549826223, CurrSamplesPerSec=11.862985413858105, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:43:47,498] [INFO] [timer.py:197:stop] 0/1239, RunningAvgSamplesPerSec=11.981736933104596, CurrSamplesPerSec=11.82484988190588, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:43:53,996] [INFO] [logging.py:68:log_dist] [Rank 0] step=1240, skipped=4, lr=[8.366666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 20:43:53,997] [INFO] [timer.py:197:stop] 0/1240, RunningAvgSamplesPerSec=11.981608014151497, 
CurrSamplesPerSec=11.824231610154246, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:44:00,565] [INFO] [timer.py:197:stop] 0/1241, RunningAvgSamplesPerSec=11.98146623336186, CurrSamplesPerSec=11.808477886332899, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:44:07,150] [INFO] [timer.py:197:stop] 0/1242, RunningAvgSamplesPerSec=11.981387252550075, CurrSamplesPerSec=11.884323429252923, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:44:13,974] [INFO] [timer.py:197:stop] 0/1243, RunningAvgSamplesPerSec=11.981213723337861, CurrSamplesPerSec=11.769836779100093, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:44:20,724] [INFO] [timer.py:197:stop] 0/1244, RunningAvgSamplesPerSec=11.981153112552091, CurrSamplesPerSec=11.906404777510264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:44:27,448] [INFO] [timer.py:197:stop] 0/1245, RunningAvgSamplesPerSec=11.980973489335499, CurrSamplesPerSec=11.761962831585812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:44:33,979] [INFO] [timer.py:197:stop] 0/1246, RunningAvgSamplesPerSec=11.980883101264363, CurrSamplesPerSec=11.86957537206248, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:44:40,437] [INFO] [timer.py:197:stop] 0/1247, RunningAvgSamplesPerSec=11.980807918859055, CurrSamplesPerSec=11.88800603710481, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:44:46,926] [INFO] [timer.py:197:stop] 0/1248, RunningAvgSamplesPerSec=11.980719178511743, CurrSamplesPerSec=11.87124776161013, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:44:53,533] [INFO] [timer.py:197:stop] 0/1249, RunningAvgSamplesPerSec=11.980584984854133, CurrSamplesPerSec=11.815682965651055, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:45:00,122] [INFO] [logging.py:68:log_dist] [Rank 0] step=1250, skipped=4, lr=[8.344444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 20:45:00,122] [INFO] [timer.py:197:stop] 0/1250, 
RunningAvgSamplesPerSec=11.980469383005653, CurrSamplesPerSec=11.838029170272584, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0007, 'learning_rate': 8.344444444444445e-06, 'epoch': 32.89} [2022-12-19 20:45:06,693] [INFO] [timer.py:197:stop] 0/1251, RunningAvgSamplesPerSec=11.980432260587483, CurrSamplesPerSec=11.934282090142345, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:45:13,174] [INFO] [timer.py:197:stop] 0/1252, RunningAvgSamplesPerSec=11.980419580155617, CurrSamplesPerSec=11.96460264714406, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:45:19,683] [INFO] [timer.py:197:stop] 0/1253, RunningAvgSamplesPerSec=11.980334112668478, CurrSamplesPerSec=11.874444774934677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:45:24,334] [INFO] [timer.py:197:stop] 0/1254, RunningAvgSamplesPerSec=11.982997789727133, CurrSamplesPerSec=16.600264443646424, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:45:30,885] [INFO] [timer.py:197:stop] 0/1255, RunningAvgSamplesPerSec=11.982908900021213, CurrSamplesPerSec=11.872643880069052, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:45:37,370] [INFO] [timer.py:197:stop] 0/1256, RunningAvgSamplesPerSec=11.982895529253335, CurrSamplesPerSec=11.966165366610214, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:45:43,906] [INFO] [timer.py:197:stop] 0/1257, RunningAvgSamplesPerSec=11.982794236600904, CurrSamplesPerSec=11.857106636788114, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:45:50,505] [INFO] [timer.py:197:stop] 0/1258, RunningAvgSamplesPerSec=11.982619963410357, CurrSamplesPerSec=11.767830676094007, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:45:57,133] [INFO] [timer.py:197:stop] 0/1259, RunningAvgSamplesPerSec=11.98253194058262, CurrSamplesPerSec=11.872986790525724, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:46:03,664] [INFO] [logging.py:68:log_dist] [Rank 0] step=1260, skipped=4, 
lr=[8.322222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 20:46:03,665] [INFO] [timer.py:197:stop] 0/1260, RunningAvgSamplesPerSec=11.98244462787346, CurrSamplesPerSec=11.87368947750547, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:46:10,304] [INFO] [timer.py:197:stop] 0/1261, RunningAvgSamplesPerSec=11.98239311790432, CurrSamplesPerSec=11.917942396613407, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:46:16,842] [INFO] [timer.py:197:stop] 0/1262, RunningAvgSamplesPerSec=11.982294977999478, CurrSamplesPerSec=11.859998922844328, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:46:23,352] [INFO] [timer.py:197:stop] 0/1263, RunningAvgSamplesPerSec=11.98226604288631, CurrSamplesPerSec=11.945918482189382, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:46:30,004] [INFO] [timer.py:197:stop] 0/1264, RunningAvgSamplesPerSec=11.98218044289338, CurrSamplesPerSec=11.875203320512403, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:46:36,713] [INFO] [timer.py:197:stop] 0/1265, RunningAvgSamplesPerSec=11.982050180137481, CurrSamplesPerSec=11.819885219591857, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:46:43,335] [INFO] [timer.py:197:stop] 0/1266, RunningAvgSamplesPerSec=11.98200196888076, CurrSamplesPerSec=11.9214192680114, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:46:49,984] [INFO] [timer.py:197:stop] 0/1267, RunningAvgSamplesPerSec=11.981934072631743, CurrSamplesPerSec=11.896724015573078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:46:56,548] [INFO] [timer.py:197:stop] 0/1268, RunningAvgSamplesPerSec=11.981894965458023, CurrSamplesPerSec=11.932627963976225, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:47:02,998] [INFO] [timer.py:197:stop] 0/1269, RunningAvgSamplesPerSec=11.981863632755143, CurrSamplesPerSec=11.942327423017975, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:47:09,662] [INFO] [logging.py:68:log_dist] [Rank 0] step=1270, 
skipped=4, lr=[8.3e-06], mom=[[0.9, 0.999]] [2022-12-19 20:47:09,662] [INFO] [timer.py:197:stop] 0/1270, RunningAvgSamplesPerSec=11.981776244125713, CurrSamplesPerSec=11.872069431287555, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:47:16,169] [INFO] [timer.py:197:stop] 0/1271, RunningAvgSamplesPerSec=11.981696376002601, CurrSamplesPerSec=11.8812730722525, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:47:22,796] [INFO] [timer.py:197:stop] 0/1272, RunningAvgSamplesPerSec=11.981645235609093, CurrSamplesPerSec=11.917097964385459, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:47:29,258] [INFO] [timer.py:197:stop] 0/1273, RunningAvgSamplesPerSec=11.981606347722328, CurrSamplesPerSec=11.932421628258146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:47:35,739] [INFO] [timer.py:197:stop] 0/1274, RunningAvgSamplesPerSec=11.981593164962943, CurrSamplesPerSec=11.964861294360137, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:47:42,333] [INFO] [timer.py:197:stop] 0/1275, RunningAvgSamplesPerSec=11.981484290314098, CurrSamplesPerSec=11.844579402649623, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.001, 'learning_rate': 8.288888888888889e-06, 'epoch': 33.55} [2022-12-19 20:47:48,849] [INFO] [timer.py:197:stop] 0/1276, RunningAvgSamplesPerSec=11.981440753744616, CurrSamplesPerSec=11.926274083426508, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:47:55,430] [INFO] [timer.py:197:stop] 0/1277, RunningAvgSamplesPerSec=11.981421410905845, CurrSamplesPerSec=11.956829253908664, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:48:02,050] [INFO] [timer.py:197:stop] 0/1278, RunningAvgSamplesPerSec=11.981294171029782, CurrSamplesPerSec=11.821232321356948, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:48:08,669] [INFO] [timer.py:197:stop] 0/1279, RunningAvgSamplesPerSec=11.981142672860056, CurrSamplesPerSec=11.790902868399586, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 20:48:15,128] [INFO] [logging.py:68:log_dist] [Rank 0] step=1280, skipped=4, lr=[8.277777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 20:48:15,129] [INFO] [timer.py:197:stop] 0/1280, RunningAvgSamplesPerSec=11.981020174666739, CurrSamplesPerSec=11.826607647764495, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:48:21,631] [INFO] [timer.py:197:stop] 0/1281, RunningAvgSamplesPerSec=11.980931788475987, CurrSamplesPerSec=11.869030083884477, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:48:28,126] [INFO] [timer.py:197:stop] 0/1282, RunningAvgSamplesPerSec=11.980859183829011, CurrSamplesPerSec=11.888712609202047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:48:34,671] [INFO] [timer.py:197:stop] 0/1283, RunningAvgSamplesPerSec=11.980820926373642, CurrSamplesPerSec=11.93205087840788, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:48:41,325] [INFO] [timer.py:197:stop] 0/1284, RunningAvgSamplesPerSec=11.980667761984284, CurrSamplesPerSec=11.787627997873237, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:48:47,994] [INFO] [timer.py:197:stop] 0/1285, RunningAvgSamplesPerSec=11.980551921877437, CurrSamplesPerSec=11.833864611314183, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:48:54,708] [INFO] [timer.py:197:stop] 0/1286, RunningAvgSamplesPerSec=11.980437134360775, CurrSamplesPerSec=11.83495452410963, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:49:01,270] [INFO] [timer.py:197:stop] 0/1287, RunningAvgSamplesPerSec=11.980363238295388, CurrSamplesPerSec=11.886226816284637, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:49:07,741] [INFO] [timer.py:197:stop] 0/1288, RunningAvgSamplesPerSec=11.980352375295462, CurrSamplesPerSec=11.966409678423375, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:49:14,182] [INFO] [timer.py:197:stop] 0/1289, RunningAvgSamplesPerSec=11.980318748390916, CurrSamplesPerSec=11.937230202646406, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 20:49:20,696] [INFO] [logging.py:68:log_dist] [Rank 0] step=1290, skipped=4, lr=[8.255555555555557e-06], mom=[[0.9, 0.999]] [2022-12-19 20:49:20,697] [INFO] [timer.py:197:stop] 0/1290, RunningAvgSamplesPerSec=11.98022992436918, CurrSamplesPerSec=11.866994748973712, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:49:27,301] [INFO] [timer.py:197:stop] 0/1291, RunningAvgSamplesPerSec=11.980212720298908, CurrSamplesPerSec=11.958094819265064, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:49:31,958] [INFO] [timer.py:197:stop] 0/1292, RunningAvgSamplesPerSec=11.982739048768192, CurrSamplesPerSec=16.455692341267067, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:49:38,500] [INFO] [timer.py:197:stop] 0/1293, RunningAvgSamplesPerSec=11.982663642654977, CurrSamplesPerSec=11.886173658413552, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:49:45,119] [INFO] [timer.py:197:stop] 0/1294, RunningAvgSamplesPerSec=11.982641670851127, CurrSamplesPerSec=11.954343113001228, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:49:51,755] [INFO] [timer.py:197:stop] 0/1295, RunningAvgSamplesPerSec=11.982543359061138, CurrSamplesPerSec=11.856857864460004, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:49:58,246] [INFO] [timer.py:197:stop] 0/1296, RunningAvgSamplesPerSec=11.982502757456462, CurrSamplesPerSec=11.930234059919286, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:50:04,736] [INFO] [timer.py:197:stop] 0/1297, RunningAvgSamplesPerSec=11.982411152741047, CurrSamplesPerSec=11.865036680176907, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:50:11,270] [INFO] [timer.py:197:stop] 0/1298, RunningAvgSamplesPerSec=11.982319518278723, CurrSamplesPerSec=11.864817467053735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:50:17,919] [INFO] [timer.py:197:stop] 0/1299, RunningAvgSamplesPerSec=11.982227374135174, CurrSamplesPerSec=11.863987886894911, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:50:24,572] [INFO] [logging.py:68:log_dist] [Rank 0] step=1300, skipped=4, lr=[8.233333333333335e-06], mom=[[0.9, 0.999]] [2022-12-19 20:50:24,573] [INFO] [timer.py:197:stop] 0/1300, RunningAvgSamplesPerSec=11.982082767948798, CurrSamplesPerSec=11.797421259098504, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0006, 'learning_rate': 8.233333333333335e-06, 'epoch': 34.21} [2022-12-19 20:50:31,131] [INFO] [timer.py:197:stop] 0/1301, RunningAvgSamplesPerSec=11.981996598803468, CurrSamplesPerSec=11.871184237892582, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:50:37,581] [INFO] [timer.py:197:stop] 0/1302, RunningAvgSamplesPerSec=11.98198003402118, CurrSamplesPerSec=11.960500984338625, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:50:44,010] [INFO] [timer.py:197:stop] 0/1303, RunningAvgSamplesPerSec=11.981965753288026, CurrSamplesPerSec=11.963429542434303, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:50:50,448] [INFO] [timer.py:197:stop] 0/1304, RunningAvgSamplesPerSec=11.981953493768843, CurrSamplesPerSec=11.966025078527812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:50:56,920] [INFO] [timer.py:197:stop] 0/1305, RunningAvgSamplesPerSec=11.981953382586486, CurrSamplesPerSec=11.98180862490573, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:51:03,419] [INFO] [timer.py:197:stop] 0/1306, RunningAvgSamplesPerSec=11.981937821419917, CurrSamplesPerSec=11.961695901663962, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:51:09,975] [INFO] [timer.py:197:stop] 0/1307, RunningAvgSamplesPerSec=11.981793346792758, CurrSamplesPerSec=11.796316994252471, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:51:16,448] [INFO] [timer.py:197:stop] 0/1308, RunningAvgSamplesPerSec=11.981776623629766, CurrSamplesPerSec=11.959992603962228, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:51:22,977] [INFO] [timer.py:197:stop] 
0/1309, RunningAvgSamplesPerSec=11.981699511446344, CurrSamplesPerSec=11.88183105639384, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:51:29,453] [INFO] [logging.py:68:log_dist] [Rank 0] step=1310, skipped=4, lr=[8.211111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 20:51:29,453] [INFO] [timer.py:197:stop] 0/1310, RunningAvgSamplesPerSec=11.98162573595762, CurrSamplesPerSec=11.885971557580568, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:51:35,948] [INFO] [timer.py:197:stop] 0/1311, RunningAvgSamplesPerSec=11.981554106638947, CurrSamplesPerSec=11.888590453624456, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:51:42,388] [INFO] [timer.py:197:stop] 0/1312, RunningAvgSamplesPerSec=11.981542014148465, CurrSamplesPerSec=11.96573384457058, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:51:48,974] [INFO] [timer.py:197:stop] 0/1313, RunningAvgSamplesPerSec=11.981420340647931, CurrSamplesPerSec=11.824122234351163, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:51:55,515] [INFO] [timer.py:197:stop] 0/1314, RunningAvgSamplesPerSec=11.981322356125432, CurrSamplesPerSec=11.854228324763485, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:52:01,990] [INFO] [timer.py:197:stop] 0/1315, RunningAvgSamplesPerSec=11.981251431797574, CurrSamplesPerSec=11.88891638292862, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:52:08,476] [INFO] [timer.py:197:stop] 0/1316, RunningAvgSamplesPerSec=11.981150262367564, CurrSamplesPerSec=11.849772500293115, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:52:14,988] [INFO] [timer.py:197:stop] 0/1317, RunningAvgSamplesPerSec=11.981052436266456, CurrSamplesPerSec=11.853874457466528, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:52:21,533] [INFO] [timer.py:197:stop] 0/1318, RunningAvgSamplesPerSec=11.980994209853916, CurrSamplesPerSec=11.904913063925196, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:52:28,044] [INFO] 
[timer.py:197:stop] 0/1319, RunningAvgSamplesPerSec=11.980891813117992, CurrSamplesPerSec=11.847637613275865, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:52:34,521] [INFO] [logging.py:68:log_dist] [Rank 0] step=1320, skipped=4, lr=[8.18888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 20:52:34,522] [INFO] [timer.py:197:stop] 0/1320, RunningAvgSamplesPerSec=11.980870076630007, CurrSamplesPerSec=11.952311411516, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:52:41,042] [INFO] [timer.py:197:stop] 0/1321, RunningAvgSamplesPerSec=11.980807762554338, CurrSamplesPerSec=11.899237410161874, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:52:47,509] [INFO] [timer.py:197:stop] 0/1322, RunningAvgSamplesPerSec=11.980719152652279, CurrSamplesPerSec=11.864972698476123, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:52:54,064] [INFO] [timer.py:197:stop] 0/1323, RunningAvgSamplesPerSec=11.980632959136972, CurrSamplesPerSec=11.867928639135295, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:53:00,542] [INFO] [timer.py:197:stop] 0/1324, RunningAvgSamplesPerSec=11.980625708368606, CurrSamplesPerSec=11.971055100652617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:53:07,032] [INFO] [timer.py:197:stop] 0/1325, RunningAvgSamplesPerSec=11.98055396862151, CurrSamplesPerSec=11.886459453353536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0006, 'learning_rate': 8.177777777777779e-06, 'epoch': 34.87} [2022-12-19 20:53:13,618] [INFO] [timer.py:197:stop] 0/1326, RunningAvgSamplesPerSec=11.980382775213341, CurrSamplesPerSec=11.758099336158116, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:53:20,057] [INFO] [timer.py:197:stop] 0/1327, RunningAvgSamplesPerSec=11.980372788585434, CurrSamplesPerSec=11.967165081107849, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:53:26,487] [INFO] [timer.py:197:stop] 0/1328, RunningAvgSamplesPerSec=11.980397514971513, 
CurrSamplesPerSec=12.013249884794993, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:53:32,967] [INFO] [timer.py:197:stop] 0/1329, RunningAvgSamplesPerSec=11.980386829147962, CurrSamplesPerSec=11.966234178316936, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:53:37,567] [INFO] [logging.py:68:log_dist] [Rank 0] step=1330, skipped=4, lr=[8.166666666666668e-06], mom=[[0.9, 0.999]] [2022-12-19 20:53:37,568] [INFO] [timer.py:197:stop] 0/1330, RunningAvgSamplesPerSec=11.982902537478143, CurrSamplesPerSec=16.61179265403995, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:53:44,075] [INFO] [timer.py:197:stop] 0/1331, RunningAvgSamplesPerSec=11.982813860615586, CurrSamplesPerSec=11.866197910613801, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:53:50,550] [INFO] [timer.py:197:stop] 0/1332, RunningAvgSamplesPerSec=11.982792845898802, CurrSamplesPerSec=11.954929278386402, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:53:57,104] [INFO] [timer.py:197:stop] 0/1333, RunningAvgSamplesPerSec=11.982704996363076, CurrSamplesPerSec=11.866994224357626, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:54:03,550] [INFO] [timer.py:197:stop] 0/1334, RunningAvgSamplesPerSec=11.982693609376577, CurrSamplesPerSec=11.967556690371545, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:54:10,039] [INFO] [timer.py:197:stop] 0/1335, RunningAvgSamplesPerSec=11.982608768783983, CurrSamplesPerSec=11.870657699696109, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:54:16,496] [INFO] [timer.py:197:stop] 0/1336, RunningAvgSamplesPerSec=11.982606043819173, CurrSamplesPerSec=11.978974767328886, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:54:22,952] [INFO] [timer.py:197:stop] 0/1337, RunningAvgSamplesPerSec=11.982592855578439, CurrSamplesPerSec=11.9650255544805, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:54:29,441] [INFO] [timer.py:197:stop] 0/1338, 
RunningAvgSamplesPerSec=11.98252102575826, CurrSamplesPerSec=11.887390092699654, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:54:35,984] [INFO] [timer.py:197:stop] 0/1339, RunningAvgSamplesPerSec=11.982408428346377, CurrSamplesPerSec=11.833844787090786, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:54:42,505] [INFO] [logging.py:68:log_dist] [Rank 0] step=1340, skipped=4, lr=[8.144444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 20:54:42,506] [INFO] [timer.py:197:stop] 0/1340, RunningAvgSamplesPerSec=11.982398525079391, CurrSamplesPerSec=11.969172482957775, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:54:49,020] [INFO] [timer.py:197:stop] 0/1341, RunningAvgSamplesPerSec=11.982367203576372, CurrSamplesPerSec=11.940605203743695, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:54:55,476] [INFO] [timer.py:197:stop] 0/1342, RunningAvgSamplesPerSec=11.982354905100326, CurrSamplesPerSec=11.965909863399364, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:55:01,966] [INFO] [timer.py:197:stop] 0/1343, RunningAvgSamplesPerSec=11.982277604435687, CurrSamplesPerSec=11.879583136583442, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:55:08,484] [INFO] [timer.py:197:stop] 0/1344, RunningAvgSamplesPerSec=11.982291901010923, CurrSamplesPerSec=12.001494355303539, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:55:14,997] [INFO] [timer.py:197:stop] 0/1345, RunningAvgSamplesPerSec=11.982258917582456, CurrSamplesPerSec=11.938158190834093, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:55:21,505] [INFO] [timer.py:197:stop] 0/1346, RunningAvgSamplesPerSec=11.982203487363318, CurrSamplesPerSec=11.908220684127414, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:55:27,910] [INFO] [timer.py:197:stop] 0/1347, RunningAvgSamplesPerSec=11.98218230635706, CurrSamplesPerSec=11.953782556284237, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:55:34,413] [INFO] 
[timer.py:197:stop] 0/1348, RunningAvgSamplesPerSec=11.982118374768122, CurrSamplesPerSec=11.896743523779, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:55:40,869] [INFO] [timer.py:197:stop] 0/1349, RunningAvgSamplesPerSec=11.982118473575229, CurrSamplesPerSec=11.982251469417424, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:55:47,393] [INFO] [logging.py:68:log_dist] [Rank 0] step=1350, skipped=4, lr=[8.122222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 20:55:47,393] [INFO] [timer.py:197:stop] 0/1350, RunningAvgSamplesPerSec=11.982053429113305, CurrSamplesPerSec=11.895075012603613, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0003, 'learning_rate': 8.122222222222223e-06, 'epoch': 35.53} [2022-12-19 20:55:53,874] [INFO] [timer.py:197:stop] 0/1351, RunningAvgSamplesPerSec=11.982047965863686, CurrSamplesPerSec=11.97468803232703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:56:00,339] [INFO] [timer.py:197:stop] 0/1352, RunningAvgSamplesPerSec=11.982046594099243, CurrSamplesPerSec=11.98019636982837, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:56:06,824] [INFO] [timer.py:197:stop] 0/1353, RunningAvgSamplesPerSec=11.981960633112893, CurrSamplesPerSec=11.867027275261604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:56:13,307] [INFO] [timer.py:197:stop] 0/1354, RunningAvgSamplesPerSec=11.981883924154296, CurrSamplesPerSec=11.879139437277534, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:56:19,790] [INFO] [timer.py:197:stop] 0/1355, RunningAvgSamplesPerSec=11.981876090047772, CurrSamplesPerSec=11.971293739510335, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:56:26,314] [INFO] [timer.py:197:stop] 0/1356, RunningAvgSamplesPerSec=11.981870637882945, CurrSamplesPerSec=11.97449840103124, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:56:32,817] [INFO] [timer.py:197:stop] 0/1357, RunningAvgSamplesPerSec=11.981823628207618, 
CurrSamplesPerSec=11.91850912218277, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:56:39,232] [INFO] [timer.py:197:stop] 0/1358, RunningAvgSamplesPerSec=11.981800850744104, CurrSamplesPerSec=11.951016741812932, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:56:45,775] [INFO] [timer.py:197:stop] 0/1359, RunningAvgSamplesPerSec=11.981652468376021, CurrSamplesPerSec=11.783771421166469, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:56:52,811] [INFO] [logging.py:68:log_dist] [Rank 0] step=1360, skipped=4, lr=[8.1e-06], mom=[[0.9, 0.999]] [2022-12-19 20:56:52,811] [INFO] [timer.py:197:stop] 0/1360, RunningAvgSamplesPerSec=11.98153052594088, CurrSamplesPerSec=11.818310520050566, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:56:59,587] [INFO] [timer.py:197:stop] 0/1361, RunningAvgSamplesPerSec=11.981406352434911, CurrSamplesPerSec=11.81512077604703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:57:06,727] [INFO] [timer.py:197:stop] 0/1362, RunningAvgSamplesPerSec=11.981361796794712, CurrSamplesPerSec=11.921115377690901, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:57:13,641] [INFO] [timer.py:197:stop] 0/1363, RunningAvgSamplesPerSec=11.981261356868913, CurrSamplesPerSec=11.84620397845723, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:57:20,614] [INFO] [timer.py:197:stop] 0/1364, RunningAvgSamplesPerSec=11.98110859990825, CurrSamplesPerSec=11.776755027898316, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:57:27,139] [INFO] [timer.py:197:stop] 0/1365, RunningAvgSamplesPerSec=11.981023310563135, CurrSamplesPerSec=11.865975507423366, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:57:33,768] [INFO] [timer.py:197:stop] 0/1366, RunningAvgSamplesPerSec=11.980918418242064, CurrSamplesPerSec=11.839637328003906, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:57:40,442] [INFO] [timer.py:197:stop] 0/1367, RunningAvgSamplesPerSec=11.98081868046328, 
CurrSamplesPerSec=11.846304875756664, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:57:45,127] [INFO] [timer.py:197:stop] 0/1368, RunningAvgSamplesPerSec=11.983234729612306, CurrSamplesPerSec=16.534656719587826, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:57:51,817] [INFO] [timer.py:197:stop] 0/1369, RunningAvgSamplesPerSec=11.983134806986325, CurrSamplesPerSec=11.84817884573462, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:57:58,458] [INFO] [logging.py:68:log_dist] [Rank 0] step=1370, skipped=4, lr=[8.077777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 20:57:58,459] [INFO] [timer.py:197:stop] 0/1370, RunningAvgSamplesPerSec=11.982997909604531, CurrSamplesPerSec=11.798738870861268, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:58:05,103] [INFO] [timer.py:197:stop] 0/1371, RunningAvgSamplesPerSec=11.98296376158245, CurrSamplesPerSec=11.936430804481208, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:58:11,600] [INFO] [timer.py:197:stop] 0/1372, RunningAvgSamplesPerSec=11.982863435820251, CurrSamplesPerSec=11.847074993112921, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:58:18,148] [INFO] [timer.py:197:stop] 0/1373, RunningAvgSamplesPerSec=11.982776543254179, CurrSamplesPerSec=11.864905570876088, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:58:24,730] [INFO] [timer.py:197:stop] 0/1374, RunningAvgSamplesPerSec=11.982611939769855, CurrSamplesPerSec=11.761115107882148, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:58:31,285] [INFO] [timer.py:197:stop] 0/1375, RunningAvgSamplesPerSec=11.98246311997431, CurrSamplesPerSec=11.781705752491947, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0003, 'learning_rate': 8.066666666666667e-06, 'epoch': 36.18} [2022-12-19 20:58:37,958] [INFO] [timer.py:197:stop] 0/1376, RunningAvgSamplesPerSec=11.98235473568969, CurrSamplesPerSec=11.835369881189727, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
20:58:44,542] [INFO] [timer.py:197:stop] 0/1377, RunningAvgSamplesPerSec=11.98232498671935, CurrSamplesPerSec=11.941588964653295, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:58:51,127] [INFO] [timer.py:197:stop] 0/1378, RunningAvgSamplesPerSec=11.982321640931913, CurrSamplesPerSec=11.977722950096528, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:58:57,678] [INFO] [timer.py:197:stop] 0/1379, RunningAvgSamplesPerSec=11.98219347101762, CurrSamplesPerSec=11.808391657426267, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:59:04,270] [INFO] [logging.py:68:log_dist] [Rank 0] step=1380, skipped=4, lr=[8.055555555555557e-06], mom=[[0.9, 0.999]] [2022-12-19 20:59:04,271] [INFO] [timer.py:197:stop] 0/1380, RunningAvgSamplesPerSec=11.982080976941162, CurrSamplesPerSec=11.829155095213663, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:59:10,789] [INFO] [timer.py:197:stop] 0/1381, RunningAvgSamplesPerSec=11.981944082024215, CurrSamplesPerSec=11.796228869346336, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:59:17,494] [INFO] [timer.py:197:stop] 0/1382, RunningAvgSamplesPerSec=11.981796514034817, CurrSamplesPerSec=11.781701098573128, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:59:24,000] [INFO] [timer.py:197:stop] 0/1383, RunningAvgSamplesPerSec=11.981692140968873, CurrSamplesPerSec=11.839369445432766, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:59:30,412] [INFO] [timer.py:197:stop] 0/1384, RunningAvgSamplesPerSec=11.981645900844894, CurrSamplesPerSec=11.918127065972417, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:59:37,044] [INFO] [timer.py:197:stop] 0/1385, RunningAvgSamplesPerSec=11.981632464785609, CurrSamplesPerSec=11.96309258397815, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:59:43,545] [INFO] [timer.py:197:stop] 0/1386, RunningAvgSamplesPerSec=11.98154233466429, CurrSamplesPerSec=11.858176738682028, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 20:59:50,036] [INFO] [timer.py:197:stop] 0/1387, RunningAvgSamplesPerSec=11.981531849614287, CurrSamplesPerSec=11.967038107039665, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 20:59:56,739] [INFO] [timer.py:197:stop] 0/1388, RunningAvgSamplesPerSec=11.98142505329145, CurrSamplesPerSec=11.835317177231717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:00:03,212] [INFO] [timer.py:197:stop] 0/1389, RunningAvgSamplesPerSec=11.981378940819054, CurrSamplesPerSec=11.917806411661072, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:00:09,748] [INFO] [logging.py:68:log_dist] [Rank 0] step=1390, skipped=4, lr=[8.033333333333335e-06], mom=[[0.9, 0.999]] [2022-12-19 21:00:09,749] [INFO] [timer.py:197:stop] 0/1390, RunningAvgSamplesPerSec=11.981274208980494, CurrSamplesPerSec=11.837752485886314, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:00:16,231] [INFO] [timer.py:197:stop] 0/1391, RunningAvgSamplesPerSec=11.981195295589487, CurrSamplesPerSec=11.87265648285926, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:00:22,920] [INFO] [timer.py:197:stop] 0/1392, RunningAvgSamplesPerSec=11.981121216089466, CurrSamplesPerSec=11.879101587686238, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:00:29,429] [INFO] [timer.py:197:stop] 0/1393, RunningAvgSamplesPerSec=11.981035533762924, CurrSamplesPerSec=11.863110189450147, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:00:35,966] [INFO] [timer.py:197:stop] 0/1394, RunningAvgSamplesPerSec=11.980965169007295, CurrSamplesPerSec=11.883881480226155, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:00:42,661] [INFO] [timer.py:197:stop] 0/1395, RunningAvgSamplesPerSec=11.980907986204143, CurrSamplesPerSec=11.901835243161527, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:00:49,586] [INFO] [timer.py:197:stop] 0/1396, RunningAvgSamplesPerSec=11.980833273642752, CurrSamplesPerSec=11.877655598528445, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 21:00:56,028] [INFO] [timer.py:197:stop] 0/1397, RunningAvgSamplesPerSec=11.980788475688, CurrSamplesPerSec=11.918664174141895, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:01:02,530] [INFO] [timer.py:197:stop] 0/1398, RunningAvgSamplesPerSec=11.980677693010687, CurrSamplesPerSec=11.828105338895451, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:01:09,042] [INFO] [timer.py:197:stop] 0/1399, RunningAvgSamplesPerSec=11.980591773517459, CurrSamplesPerSec=11.861837916049607, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:01:15,550] [INFO] [logging.py:68:log_dist] [Rank 0] step=1400, skipped=4, lr=[8.011111111111113e-06], mom=[[0.9, 0.999]] [2022-12-19 21:01:15,550] [INFO] [timer.py:197:stop] 0/1400, RunningAvgSamplesPerSec=11.980522254425523, CurrSamplesPerSec=11.88418557971958, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0002, 'learning_rate': 8.011111111111113e-06, 'epoch': 36.84} [2022-12-19 21:01:22,039] [INFO] [timer.py:197:stop] 0/1401, RunningAvgSamplesPerSec=11.980518434887875, CurrSamplesPerSec=11.975181101805797, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:01:28,549] [INFO] [timer.py:197:stop] 0/1402, RunningAvgSamplesPerSec=11.980511100185435, CurrSamplesPerSec=11.970258638930371, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:01:35,062] [INFO] [timer.py:197:stop] 0/1403, RunningAvgSamplesPerSec=11.980397793306091, CurrSamplesPerSec=11.823842554000272, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:01:41,643] [INFO] [timer.py:197:stop] 0/1404, RunningAvgSamplesPerSec=11.98030896677523, CurrSamplesPerSec=11.857143298855867, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:01:48,238] [INFO] [timer.py:197:stop] 0/1405, RunningAvgSamplesPerSec=11.980220924045154, CurrSamplesPerSec=11.858044733291742, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:01:52,890] [INFO] [timer.py:197:stop] 0/1406, 
RunningAvgSamplesPerSec=11.982558232120859, CurrSamplesPerSec=16.49857060981025, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:01:59,376] [INFO] [timer.py:197:stop] 0/1407, RunningAvgSamplesPerSec=11.982542899627159, CurrSamplesPerSec=11.961054709829677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:02:05,853] [INFO] [timer.py:197:stop] 0/1408, RunningAvgSamplesPerSec=11.98251989560634, CurrSamplesPerSec=11.950286252575383, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:02:12,480] [INFO] [timer.py:197:stop] 0/1409, RunningAvgSamplesPerSec=11.982454147207243, CurrSamplesPerSec=11.890720111258343, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:02:18,983] [INFO] [logging.py:68:log_dist] [Rank 0] step=1410, skipped=4, lr=[7.98888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 21:02:18,983] [INFO] [timer.py:197:stop] 0/1410, RunningAvgSamplesPerSec=11.982374369829332, CurrSamplesPerSec=11.871170063271524, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:02:25,484] [INFO] [timer.py:197:stop] 0/1411, RunningAvgSamplesPerSec=11.98229967878743, CurrSamplesPerSec=11.878050306545083, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:02:31,961] [INFO] [timer.py:197:stop] 0/1412, RunningAvgSamplesPerSec=11.982222372935048, CurrSamplesPerSec=11.874280366130726, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:02:38,474] [INFO] [timer.py:197:stop] 0/1413, RunningAvgSamplesPerSec=11.982131225791836, CurrSamplesPerSec=11.854978530484203, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:02:44,976] [INFO] [timer.py:197:stop] 0/1414, RunningAvgSamplesPerSec=11.982052619873524, CurrSamplesPerSec=11.872157642844124, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:02:51,456] [INFO] [timer.py:197:stop] 0/1415, RunningAvgSamplesPerSec=11.982025621303253, CurrSamplesPerSec=11.944024629561667, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:02:58,141] [INFO] 
[timer.py:197:stop] 0/1416, RunningAvgSamplesPerSec=11.982013098917864, CurrSamplesPerSec=11.964345077627602, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:03:04,628] [INFO] [timer.py:197:stop] 0/1417, RunningAvgSamplesPerSec=11.981962161191175, CurrSamplesPerSec=11.910366893340468, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:03:11,157] [INFO] [timer.py:197:stop] 0/1418, RunningAvgSamplesPerSec=11.981883107267286, CurrSamplesPerSec=11.871057192496979, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:03:17,714] [INFO] [timer.py:197:stop] 0/1419, RunningAvgSamplesPerSec=11.981804411669591, CurrSamplesPerSec=11.871398961292499, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:03:24,364] [INFO] [logging.py:68:log_dist] [Rank 0] step=1420, skipped=4, lr=[7.966666666666668e-06], mom=[[0.9, 0.999]] [2022-12-19 21:03:24,364] [INFO] [timer.py:197:stop] 0/1420, RunningAvgSamplesPerSec=11.981710156097234, CurrSamplesPerSec=11.849623420238665, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:03:30,889] [INFO] [timer.py:197:stop] 0/1421, RunningAvgSamplesPerSec=11.981585644077043, CurrSamplesPerSec=11.807593320670438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:03:37,441] [INFO] [timer.py:197:stop] 0/1422, RunningAvgSamplesPerSec=11.981474820138711, CurrSamplesPerSec=11.826254385522418, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:03:43,862] [INFO] [timer.py:197:stop] 0/1423, RunningAvgSamplesPerSec=11.981462080546693, CurrSamplesPerSec=11.963399151415013, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:03:50,442] [INFO] [timer.py:197:stop] 0/1424, RunningAvgSamplesPerSec=11.981397354902644, CurrSamplesPerSec=11.890123372084375, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:03:57,113] [INFO] [timer.py:197:stop] 0/1425, RunningAvgSamplesPerSec=11.981043297118187, CurrSamplesPerSec=11.497890561393024, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0002, 
'learning_rate': 7.955555555555557e-06, 'epoch': 37.5} [2022-12-19 21:04:03,786] [INFO] [timer.py:197:stop] 0/1426, RunningAvgSamplesPerSec=11.980943760073416, CurrSamplesPerSec=11.840958641606791, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:04:10,465] [INFO] [timer.py:197:stop] 0/1427, RunningAvgSamplesPerSec=11.980800896012239, CurrSamplesPerSec=11.780761599368592, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:04:16,985] [INFO] [timer.py:197:stop] 0/1428, RunningAvgSamplesPerSec=11.980707048250473, CurrSamplesPerSec=11.84845131093036, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:04:23,516] [INFO] [timer.py:197:stop] 0/1429, RunningAvgSamplesPerSec=11.980620215398877, CurrSamplesPerSec=11.858064114851475, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:04:30,033] [INFO] [logging.py:68:log_dist] [Rank 0] step=1430, skipped=4, lr=[7.944444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 21:04:30,034] [INFO] [timer.py:197:stop] 0/1430, RunningAvgSamplesPerSec=11.980583345690269, CurrSamplesPerSec=11.928200473590103, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:04:36,811] [INFO] [timer.py:197:stop] 0/1431, RunningAvgSamplesPerSec=11.980510061037894, CurrSamplesPerSec=11.876766419064666, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:04:43,294] [INFO] [timer.py:197:stop] 0/1432, RunningAvgSamplesPerSec=11.98049984950412, CurrSamplesPerSec=11.96592533196822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:04:49,760] [INFO] [timer.py:197:stop] 0/1433, RunningAvgSamplesPerSec=11.980500294682008, CurrSamplesPerSec=11.98113693291271, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:04:56,306] [INFO] [timer.py:197:stop] 0/1434, RunningAvgSamplesPerSec=11.980420661376451, CurrSamplesPerSec=11.867539850148331, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:05:02,789] [INFO] [timer.py:197:stop] 0/1435, RunningAvgSamplesPerSec=11.980346788611667, 
CurrSamplesPerSec=11.875487537406274, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:05:09,253] [INFO] [timer.py:197:stop] 0/1436, RunningAvgSamplesPerSec=11.980282718597376, CurrSamplesPerSec=11.889169135240525, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:05:15,684] [INFO] [timer.py:197:stop] 0/1437, RunningAvgSamplesPerSec=11.980273193284175, CurrSamplesPerSec=11.966629460892523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:05:22,154] [INFO] [timer.py:197:stop] 0/1438, RunningAvgSamplesPerSec=11.980185155488941, CurrSamplesPerSec=11.855170154426318, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:05:28,655] [INFO] [timer.py:197:stop] 0/1439, RunningAvgSamplesPerSec=11.980078998976385, CurrSamplesPerSec=11.829554927275474, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:05:35,260] [INFO] [logging.py:68:log_dist] [Rank 0] step=1440, skipped=4, lr=[7.922222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 21:05:35,260] [INFO] [timer.py:197:stop] 0/1440, RunningAvgSamplesPerSec=11.980080220193031, CurrSamplesPerSec=11.981835365795414, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:05:41,793] [INFO] [timer.py:197:stop] 0/1441, RunningAvgSamplesPerSec=11.979979093046312, CurrSamplesPerSec=11.836303497412628, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:05:48,390] [INFO] [timer.py:197:stop] 0/1442, RunningAvgSamplesPerSec=11.979967806444666, CurrSamplesPerSec=11.963748390856624, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:05:54,861] [INFO] [timer.py:197:stop] 0/1443, RunningAvgSamplesPerSec=11.979895991865515, CurrSamplesPerSec=11.877368650180395, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:05:59,543] [INFO] [timer.py:197:stop] 0/1444, RunningAvgSamplesPerSec=11.98215668193944, CurrSamplesPerSec=16.45735092737553, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:06:06,059] [INFO] [timer.py:197:stop] 0/1445, 
RunningAvgSamplesPerSec=11.98206305327611, CurrSamplesPerSec=11.848555907424329, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:06:12,476] [INFO] [timer.py:197:stop] 0/1446, RunningAvgSamplesPerSec=11.981994913979445, CurrSamplesPerSec=11.884470752899988, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:06:18,921] [INFO] [timer.py:197:stop] 0/1447, RunningAvgSamplesPerSec=11.98192742198954, CurrSamplesPerSec=11.885255839067263, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:06:25,414] [INFO] [timer.py:197:stop] 0/1448, RunningAvgSamplesPerSec=11.981835553314939, CurrSamplesPerSec=11.85054097593444, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:06:31,903] [INFO] [timer.py:197:stop] 0/1449, RunningAvgSamplesPerSec=11.981785543197136, CurrSamplesPerSec=11.909905039039053, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:06:38,349] [INFO] [logging.py:68:log_dist] [Rank 0] step=1450, skipped=4, lr=[7.9e-06], mom=[[0.9, 0.999]] [2022-12-19 21:06:38,350] [INFO] [timer.py:197:stop] 0/1450, RunningAvgSamplesPerSec=11.981768853593337, CurrSamplesPerSec=11.95766760774088, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0002, 'learning_rate': 7.9e-06, 'epoch': 38.16} [2022-12-19 21:06:44,876] [INFO] [timer.py:197:stop] 0/1451, RunningAvgSamplesPerSec=11.981665239670576, CurrSamplesPerSec=11.833489005742628, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:06:51,349] [INFO] [timer.py:197:stop] 0/1452, RunningAvgSamplesPerSec=11.981634724988645, CurrSamplesPerSec=11.937581632000517, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:06:57,930] [INFO] [timer.py:197:stop] 0/1453, RunningAvgSamplesPerSec=11.981442504555945, CurrSamplesPerSec=11.709063493737656, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:07:04,468] [INFO] [timer.py:197:stop] 0/1454, RunningAvgSamplesPerSec=11.981341956665716, CurrSamplesPerSec=11.837203331794347, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 21:07:11,175] [INFO] [timer.py:197:stop] 0/1455, RunningAvgSamplesPerSec=11.981268584966376, CurrSamplesPerSec=11.875672469567188, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:07:17,665] [INFO] [timer.py:197:stop] 0/1456, RunningAvgSamplesPerSec=11.981189721790338, CurrSamplesPerSec=11.8676878073271, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:07:24,221] [INFO] [timer.py:197:stop] 0/1457, RunningAvgSamplesPerSec=11.981084955527963, CurrSamplesPerSec=11.830668552507039, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:07:30,723] [INFO] [timer.py:197:stop] 0/1458, RunningAvgSamplesPerSec=11.981002746859737, CurrSamplesPerSec=11.86257231004085, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:07:37,231] [INFO] [timer.py:197:stop] 0/1459, RunningAvgSamplesPerSec=11.98090575257255, CurrSamplesPerSec=11.84132845332711, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:07:43,746] [INFO] [logging.py:68:log_dist] [Rank 0] step=1460, skipped=4, lr=[7.877777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 21:07:43,746] [INFO] [timer.py:197:stop] 0/1460, RunningAvgSamplesPerSec=11.980788359481796, CurrSamplesPerSec=11.81215573423115, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:07:50,467] [INFO] [timer.py:197:stop] 0/1461, RunningAvgSamplesPerSec=11.980704733191775, CurrSamplesPerSec=11.86000678281576, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:07:56,945] [INFO] [timer.py:197:stop] 0/1462, RunningAvgSamplesPerSec=11.980666163387005, CurrSamplesPerSec=11.924656078639027, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:08:03,517] [INFO] [timer.py:197:stop] 0/1463, RunningAvgSamplesPerSec=11.98058439300641, CurrSamplesPerSec=11.862378349900391, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:08:10,055] [INFO] [timer.py:197:stop] 0/1464, RunningAvgSamplesPerSec=11.980470814606116, CurrSamplesPerSec=11.816801263121398, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 21:08:16,675] [INFO] [timer.py:197:stop] 0/1465, RunningAvgSamplesPerSec=11.980317700458494, CurrSamplesPerSec=11.76057355628564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:08:23,222] [INFO] [timer.py:197:stop] 0/1466, RunningAvgSamplesPerSec=11.980244133746465, CurrSamplesPerSec=11.873574983222607, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:08:29,763] [INFO] [timer.py:197:stop] 0/1467, RunningAvgSamplesPerSec=11.980172070555382, CurrSamplesPerSec=11.875593137167723, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:08:36,257] [INFO] [timer.py:197:stop] 0/1468, RunningAvgSamplesPerSec=11.980131668583185, CurrSamplesPerSec=11.921233967192158, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:08:42,763] [INFO] [timer.py:197:stop] 0/1469, RunningAvgSamplesPerSec=11.980070041527572, CurrSamplesPerSec=11.890401456674137, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:08:49,255] [INFO] [logging.py:68:log_dist] [Rank 0] step=1470, skipped=4, lr=[7.855555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 21:08:49,255] [INFO] [timer.py:197:stop] 0/1470, RunningAvgSamplesPerSec=11.980035991295614, CurrSamplesPerSec=11.93029185443802, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:08:55,777] [INFO] [timer.py:197:stop] 0/1471, RunningAvgSamplesPerSec=11.979988436720186, CurrSamplesPerSec=11.91058303654726, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:09:02,373] [INFO] [timer.py:197:stop] 0/1472, RunningAvgSamplesPerSec=11.979907076701746, CurrSamplesPerSec=11.86157060034067, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:09:08,862] [INFO] [timer.py:197:stop] 0/1473, RunningAvgSamplesPerSec=11.979883400026456, CurrSamplesPerSec=11.945179579662645, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:09:15,334] [INFO] [timer.py:197:stop] 0/1474, RunningAvgSamplesPerSec=11.979872200233293, CurrSamplesPerSec=11.963419945253637, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:09:21,829] [INFO] [timer.py:197:stop] 0/1475, RunningAvgSamplesPerSec=11.979792760725074, CurrSamplesPerSec=11.863988935595579, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0002, 'learning_rate': 7.844444444444446e-06, 'epoch': 38.82} [2022-12-19 21:09:28,393] [INFO] [timer.py:197:stop] 0/1476, RunningAvgSamplesPerSec=11.979722040717537, CurrSamplesPerSec=11.876450088267445, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:09:34,877] [INFO] [timer.py:197:stop] 0/1477, RunningAvgSamplesPerSec=11.979713769058256, CurrSamplesPerSec=11.967533747977681, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:09:41,411] [INFO] [timer.py:197:stop] 0/1478, RunningAvgSamplesPerSec=11.97964024332195, CurrSamplesPerSec=11.87216341864653, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:09:47,899] [INFO] [timer.py:197:stop] 0/1479, RunningAvgSamplesPerSec=11.979636975199124, CurrSamplesPerSec=11.974815168784227, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:09:54,355] [INFO] [logging.py:68:log_dist] [Rank 0] step=1480, skipped=4, lr=[7.833333333333333e-06], mom=[[0.9, 0.999]] [2022-12-19 21:09:54,356] [INFO] [timer.py:197:stop] 0/1480, RunningAvgSamplesPerSec=11.979583083184465, CurrSamplesPerSec=11.900510333235003, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:10:00,868] [INFO] [timer.py:197:stop] 0/1481, RunningAvgSamplesPerSec=11.979509075417576, CurrSamplesPerSec=11.871115990027988, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:10:05,793] [INFO] [timer.py:197:stop] 0/1482, RunningAvgSamplesPerSec=11.98175595798845, CurrSamplesPerSec=16.581495284071014, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:10:12,332] [INFO] [timer.py:197:stop] 0/1483, RunningAvgSamplesPerSec=11.98172147751766, CurrSamplesPerSec=11.93090695006206, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:10:18,874] [INFO] [timer.py:197:stop] 
0/1484, RunningAvgSamplesPerSec=11.981697782123328, CurrSamplesPerSec=11.94670745456634, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:10:25,395] [INFO] [timer.py:197:stop] 0/1485, RunningAvgSamplesPerSec=11.981629168675276, CurrSamplesPerSec=11.880800325572368, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:10:31,908] [INFO] [timer.py:197:stop] 0/1486, RunningAvgSamplesPerSec=11.981559496728417, CurrSamplesPerSec=11.879119986762992, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:10:38,408] [INFO] [timer.py:197:stop] 0/1487, RunningAvgSamplesPerSec=11.981480195291201, CurrSamplesPerSec=11.864942281188245, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:10:44,936] [INFO] [timer.py:197:stop] 0/1488, RunningAvgSamplesPerSec=11.981421172885575, CurrSamplesPerSec=11.894409847341413, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:10:51,526] [INFO] [timer.py:197:stop] 0/1489, RunningAvgSamplesPerSec=11.981317490980343, CurrSamplesPerSec=11.829203574062847, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:10:58,042] [INFO] [logging.py:68:log_dist] [Rank 0] step=1490, skipped=4, lr=[7.811111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 21:10:58,043] [INFO] [timer.py:197:stop] 0/1490, RunningAvgSamplesPerSec=11.981283409146888, CurrSamplesPerSec=11.93081733304562, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:11:04,520] [INFO] [timer.py:197:stop] 0/1491, RunningAvgSamplesPerSec=11.981243432673136, CurrSamplesPerSec=11.922052510859448, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:11:11,094] [INFO] [timer.py:197:stop] 0/1492, RunningAvgSamplesPerSec=11.98106469526318, CurrSamplesPerSec=11.720711892916533, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:11:17,767] [INFO] [timer.py:197:stop] 0/1493, RunningAvgSamplesPerSec=11.981022714502592, CurrSamplesPerSec=11.918796474224989, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:11:24,269] [INFO] 
[timer.py:197:stop] 0/1494, RunningAvgSamplesPerSec=11.980943306110944, CurrSamplesPerSec=11.863704744497111, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:11:30,769] [INFO] [timer.py:197:stop] 0/1495, RunningAvgSamplesPerSec=11.98091343853537, CurrSamplesPerSec=11.936516259598811, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:11:37,390] [INFO] [timer.py:197:stop] 0/1496, RunningAvgSamplesPerSec=11.980813080303719, CurrSamplesPerSec=11.832830188562806, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:11:43,970] [INFO] [timer.py:197:stop] 0/1497, RunningAvgSamplesPerSec=11.980706998995704, CurrSamplesPerSec=11.824292028132108, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:11:50,458] [INFO] [timer.py:197:stop] 0/1498, RunningAvgSamplesPerSec=11.980644599561648, CurrSamplesPerSec=11.8880786911661, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:11:56,868] [INFO] [timer.py:197:stop] 0/1499, RunningAvgSamplesPerSec=11.980602468388104, CurrSamplesPerSec=11.91790429934887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:12:03,471] [INFO] [logging.py:68:log_dist] [Rank 0] step=1500, skipped=4, lr=[7.788888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 21:12:03,472] [INFO] [timer.py:197:stop] 0/1500, RunningAvgSamplesPerSec=11.980580035819317, CurrSamplesPerSec=11.947092408834784, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0002, 'learning_rate': 7.788888888888889e-06, 'epoch': 39.47} [2022-12-19 21:12:10,095] [INFO] [timer.py:197:stop] 0/1501, RunningAvgSamplesPerSec=11.98052992890709, CurrSamplesPerSec=11.905937421716827, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:12:16,553] [INFO] [timer.py:197:stop] 0/1502, RunningAvgSamplesPerSec=11.980504787307721, CurrSamplesPerSec=11.942935790061666, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:12:23,129] [INFO] [timer.py:197:stop] 0/1503, RunningAvgSamplesPerSec=11.980372096981892, 
CurrSamplesPerSec=11.784591372781506, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:12:29,601] [INFO] [timer.py:197:stop] 0/1504, RunningAvgSamplesPerSec=11.980259998351245, CurrSamplesPerSec=11.814331928578858, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:12:36,069] [INFO] [timer.py:197:stop] 0/1505, RunningAvgSamplesPerSec=11.980193277763334, CurrSamplesPerSec=11.88081084233073, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:12:42,739] [INFO] [timer.py:197:stop] 0/1506, RunningAvgSamplesPerSec=11.980182498572253, CurrSamplesPerSec=11.964003268540441, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:12:49,421] [INFO] [timer.py:197:stop] 0/1507, RunningAvgSamplesPerSec=11.980128630557658, CurrSamplesPerSec=11.89965570916969, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:12:55,917] [INFO] [timer.py:197:stop] 0/1508, RunningAvgSamplesPerSec=11.98013131407839, CurrSamplesPerSec=11.984171375658747, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:13:02,416] [INFO] [timer.py:197:stop] 0/1509, RunningAvgSamplesPerSec=11.980040459955276, CurrSamplesPerSec=11.844760237502847, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:13:08,998] [INFO] [logging.py:68:log_dist] [Rank 0] step=1510, skipped=4, lr=[7.766666666666666e-06], mom=[[0.9, 0.999]] [2022-12-19 21:13:08,999] [INFO] [timer.py:197:stop] 0/1510, RunningAvgSamplesPerSec=11.979911813362317, CurrSamplesPerSec=11.789130846942879, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:13:15,565] [INFO] [timer.py:197:stop] 0/1511, RunningAvgSamplesPerSec=11.97982100239201, CurrSamplesPerSec=11.844426794873153, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:13:22,106] [INFO] [timer.py:197:stop] 0/1512, RunningAvgSamplesPerSec=11.979735790036717, CurrSamplesPerSec=11.85251676629222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:13:28,657] [INFO] [timer.py:197:stop] 0/1513, 
RunningAvgSamplesPerSec=11.979739766499915, CurrSamplesPerSec=11.985747238976288, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:13:35,365] [INFO] [timer.py:197:stop] 0/1514, RunningAvgSamplesPerSec=11.979640092426685, CurrSamplesPerSec=11.830903712733885, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:13:41,989] [INFO] [timer.py:197:stop] 0/1515, RunningAvgSamplesPerSec=11.97963308136236, CurrSamplesPerSec=11.969041730544772, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:13:48,543] [INFO] [timer.py:197:stop] 0/1516, RunningAvgSamplesPerSec=11.97957403362226, CurrSamplesPerSec=11.890896563705724, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:13:54,984] [INFO] [timer.py:197:stop] 0/1517, RunningAvgSamplesPerSec=11.97956555492441, CurrSamplesPerSec=11.966742555988622, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:14:01,466] [INFO] [timer.py:197:stop] 0/1518, RunningAvgSamplesPerSec=11.979527873361311, CurrSamplesPerSec=11.922711239517483, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:14:08,077] [INFO] [timer.py:197:stop] 0/1519, RunningAvgSamplesPerSec=11.979522218645776, CurrSamplesPerSec=11.970955804065262, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:14:12,886] [INFO] [logging.py:68:log_dist] [Rank 0] step=1520, skipped=4, lr=[7.744444444444446e-06], mom=[[0.9, 0.999]] [2022-12-19 21:14:12,887] [INFO] [timer.py:197:stop] 0/1520, RunningAvgSamplesPerSec=11.981662037274063, CurrSamplesPerSec=16.435099879330625, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:14:19,331] [INFO] [timer.py:197:stop] 0/1521, RunningAvgSamplesPerSec=11.9815518149292, CurrSamplesPerSec=11.81654013472334, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:14:25,943] [INFO] [timer.py:197:stop] 0/1522, RunningAvgSamplesPerSec=11.981470669754533, CurrSamplesPerSec=11.859467088979937, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:14:32,684] [INFO] 
[timer.py:197:stop] 0/1523, RunningAvgSamplesPerSec=11.981407430248463, CurrSamplesPerSec=11.8860489234938, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:14:39,468] [INFO] [timer.py:197:stop] 0/1524, RunningAvgSamplesPerSec=11.981335919443893, CurrSamplesPerSec=11.873547147790257, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:14:46,034] [INFO] [timer.py:197:stop] 0/1525, RunningAvgSamplesPerSec=11.981270659730416, CurrSamplesPerSec=11.882762549133869, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0002, 'learning_rate': 7.733333333333334e-06, 'epoch': 40.13} [2022-12-19 21:14:52,541] [INFO] [timer.py:197:stop] 0/1526, RunningAvgSamplesPerSec=11.981181914746816, CurrSamplesPerSec=11.847531987200707, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:14:59,006] [INFO] [timer.py:197:stop] 0/1527, RunningAvgSamplesPerSec=11.981169142180644, CurrSamplesPerSec=11.961735345467574, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:15:05,426] [INFO] [timer.py:197:stop] 0/1528, RunningAvgSamplesPerSec=11.981086082538026, CurrSamplesPerSec=11.85574611066734, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:15:12,084] [INFO] [timer.py:197:stop] 0/1529, RunningAvgSamplesPerSec=11.98107563792849, CurrSamplesPerSec=11.965158352485162, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:15:18,792] [INFO] [logging.py:68:log_dist] [Rank 0] step=1530, skipped=4, lr=[7.722222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 21:15:18,793] [INFO] [timer.py:197:stop] 0/1530, RunningAvgSamplesPerSec=11.981016679670608, CurrSamplesPerSec=11.89165931990022, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:15:25,403] [INFO] [timer.py:197:stop] 0/1531, RunningAvgSamplesPerSec=11.980964257971003, CurrSamplesPerSec=11.901396212051853, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:15:31,952] [INFO] [timer.py:197:stop] 0/1532, RunningAvgSamplesPerSec=11.980946440248085, 
CurrSamplesPerSec=11.953764989829287, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:15:38,436] [INFO] [timer.py:197:stop] 0/1533, RunningAvgSamplesPerSec=11.98089039526604, CurrSamplesPerSec=11.895751322651439, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:15:44,871] [INFO] [timer.py:197:stop] 0/1534, RunningAvgSamplesPerSec=11.980887503803437, CurrSamplesPerSec=11.976462310692376, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:15:51,447] [INFO] [timer.py:197:stop] 0/1535, RunningAvgSamplesPerSec=11.980743412393009, CurrSamplesPerSec=11.763991680573207, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:15:57,948] [INFO] [timer.py:197:stop] 0/1536, RunningAvgSamplesPerSec=11.980727713739903, CurrSamplesPerSec=11.95670995516737, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:16:04,674] [INFO] [timer.py:197:stop] 0/1537, RunningAvgSamplesPerSec=11.980629921813597, CurrSamplesPerSec=11.832473425081663, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:16:11,182] [INFO] [timer.py:197:stop] 0/1538, RunningAvgSamplesPerSec=11.980629580710808, CurrSamplesPerSec=11.980106010826729, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:16:17,685] [INFO] [timer.py:197:stop] 0/1539, RunningAvgSamplesPerSec=11.980610058314195, CurrSamplesPerSec=11.950698571648308, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:16:24,268] [INFO] [logging.py:68:log_dist] [Rank 0] step=1540, skipped=4, lr=[7.7e-06], mom=[[0.9, 0.999]] [2022-12-19 21:16:24,269] [INFO] [timer.py:197:stop] 0/1540, RunningAvgSamplesPerSec=11.980505192979479, CurrSamplesPerSec=11.821468147795514, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:16:30,881] [INFO] [timer.py:197:stop] 0/1541, RunningAvgSamplesPerSec=11.980425694254096, CurrSamplesPerSec=11.859392688439591, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:16:37,562] [INFO] [timer.py:197:stop] 0/1542, RunningAvgSamplesPerSec=11.980352740366953, 
CurrSamplesPerSec=11.869119824692733, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:16:44,260] [INFO] [timer.py:197:stop] 0/1543, RunningAvgSamplesPerSec=11.980285269102644, CurrSamplesPerSec=11.877273529337327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:16:50,927] [INFO] [timer.py:197:stop] 0/1544, RunningAvgSamplesPerSec=11.980211376396444, CurrSamplesPerSec=11.867415506010778, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:16:57,430] [INFO] [timer.py:197:stop] 0/1545, RunningAvgSamplesPerSec=11.980202355878394, CurrSamplesPerSec=11.966308858584089, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:17:03,928] [INFO] [timer.py:197:stop] 0/1546, RunningAvgSamplesPerSec=11.980124291952043, CurrSamplesPerSec=11.860871443280047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:17:10,396] [INFO] [timer.py:197:stop] 0/1547, RunningAvgSamplesPerSec=11.980107799963609, CurrSamplesPerSec=11.954698212688031, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:17:16,971] [INFO] [timer.py:197:stop] 0/1548, RunningAvgSamplesPerSec=11.980096249919983, CurrSamplesPerSec=11.962277990684738, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:17:23,582] [INFO] [timer.py:197:stop] 0/1549, RunningAvgSamplesPerSec=11.980051947733983, CurrSamplesPerSec=11.911950363725785, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:17:30,091] [INFO] [logging.py:68:log_dist] [Rank 0] step=1550, skipped=4, lr=[7.677777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 21:17:30,092] [INFO] [timer.py:197:stop] 0/1550, RunningAvgSamplesPerSec=11.98003911837065, CurrSamplesPerSec=11.960224940155536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0002, 'learning_rate': 7.677777777777778e-06, 'epoch': 40.79} [2022-12-19 21:17:36,548] [INFO] [timer.py:197:stop] 0/1551, RunningAvgSamplesPerSec=11.980046739284461, CurrSamplesPerSec=11.991855549927733, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
21:17:43,050] [INFO] [timer.py:197:stop] 0/1552, RunningAvgSamplesPerSec=11.979992817188009, CurrSamplesPerSec=11.897046172790546, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:17:49,514] [INFO] [timer.py:197:stop] 0/1553, RunningAvgSamplesPerSec=11.979936764116605, CurrSamplesPerSec=11.893680464777667, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:17:55,982] [INFO] [timer.py:197:stop] 0/1554, RunningAvgSamplesPerSec=11.979935075269244, CurrSamplesPerSec=11.977316245989256, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:18:02,510] [INFO] [timer.py:197:stop] 0/1555, RunningAvgSamplesPerSec=11.979800336784223, CurrSamplesPerSec=11.774276061005503, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:18:08,990] [INFO] [timer.py:197:stop] 0/1556, RunningAvgSamplesPerSec=11.979709489629158, CurrSamplesPerSec=11.840267134780637, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:18:15,463] [INFO] [timer.py:197:stop] 0/1557, RunningAvgSamplesPerSec=11.979709946780337, CurrSamplesPerSec=11.98042040187127, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:18:20,085] [INFO] [timer.py:197:stop] 0/1558, RunningAvgSamplesPerSec=11.981853745242443, CurrSamplesPerSec=16.601598070016095, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:18:26,554] [INFO] [timer.py:197:stop] 0/1559, RunningAvgSamplesPerSec=11.981793533272752, CurrSamplesPerSec=11.888831081448862, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:18:33,095] [INFO] [logging.py:68:log_dist] [Rank 0] step=1560, skipped=4, lr=[7.655555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 21:18:33,096] [INFO] [timer.py:197:stop] 0/1560, RunningAvgSamplesPerSec=11.98170411791646, CurrSamplesPerSec=11.844084486938481, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:18:39,757] [INFO] [timer.py:197:stop] 0/1561, RunningAvgSamplesPerSec=11.981617712486544, CurrSamplesPerSec=11.848494718251189, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 21:18:46,198] [INFO] [timer.py:197:stop] 0/1562, RunningAvgSamplesPerSec=11.981583373987704, CurrSamplesPerSec=11.928287931100218, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:18:52,706] [INFO] [timer.py:197:stop] 0/1563, RunningAvgSamplesPerSec=11.981520563370749, CurrSamplesPerSec=11.884331321498562, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:18:59,218] [INFO] [timer.py:197:stop] 0/1564, RunningAvgSamplesPerSec=11.981440534137434, CurrSamplesPerSec=11.857804826257734, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:19:05,682] [INFO] [timer.py:197:stop] 0/1565, RunningAvgSamplesPerSec=11.981440532863672, CurrSamplesPerSec=11.981438543249512, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:19:12,071] [INFO] [timer.py:197:stop] 0/1566, RunningAvgSamplesPerSec=11.98138706884306, CurrSamplesPerSec=11.89840195515808, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:19:18,523] [INFO] [timer.py:197:stop] 0/1567, RunningAvgSamplesPerSec=11.981374508300386, CurrSamplesPerSec=11.961761996833493, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:19:25,077] [INFO] [timer.py:197:stop] 0/1568, RunningAvgSamplesPerSec=11.981258708754659, CurrSamplesPerSec=11.80273446954969, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:19:31,517] [INFO] [timer.py:197:stop] 0/1569, RunningAvgSamplesPerSec=11.981250952166949, CurrSamplesPerSec=11.969116445859543, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:19:38,014] [INFO] [logging.py:68:log_dist] [Rank 0] step=1570, skipped=4, lr=[7.633333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 21:19:38,014] [INFO] [timer.py:197:stop] 0/1570, RunningAvgSamplesPerSec=11.981247714163251, CurrSamplesPerSec=11.97617591160501, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:19:44,549] [INFO] [timer.py:197:stop] 0/1571, RunningAvgSamplesPerSec=11.981178392544118, CurrSamplesPerSec=11.873459965959873, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:19:51,039] [INFO] [timer.py:197:stop] 0/1572, RunningAvgSamplesPerSec=11.981124976856144, CurrSamplesPerSec=11.89789831264522, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:19:57,485] [INFO] [timer.py:197:stop] 0/1573, RunningAvgSamplesPerSec=11.98113443715532, CurrSamplesPerSec=11.996005553899595, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:20:03,974] [INFO] [timer.py:197:stop] 0/1574, RunningAvgSamplesPerSec=11.981065267447372, CurrSamplesPerSec=11.87337698685268, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:20:10,465] [INFO] [timer.py:197:stop] 0/1575, RunningAvgSamplesPerSec=11.981053198699312, CurrSamplesPerSec=11.962111140695551, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0002, 'learning_rate': 7.622222222222223e-06, 'epoch': 41.45} [2022-12-19 21:20:16,980] [INFO] [timer.py:197:stop] 0/1576, RunningAvgSamplesPerSec=11.981006737053919, CurrSamplesPerSec=11.908365959454619, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:20:23,484] [INFO] [timer.py:197:stop] 0/1577, RunningAvgSamplesPerSec=11.98097858796247, CurrSamplesPerSec=11.936835267461852, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:20:29,976] [INFO] [timer.py:197:stop] 0/1578, RunningAvgSamplesPerSec=11.980964330818141, CurrSamplesPerSec=11.958551362076843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:20:36,522] [INFO] [timer.py:197:stop] 0/1579, RunningAvgSamplesPerSec=11.980865341699745, CurrSamplesPerSec=11.826865052718425, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:20:43,011] [INFO] [logging.py:68:log_dist] [Rank 0] step=1580, skipped=4, lr=[7.611111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 21:20:43,012] [INFO] [timer.py:197:stop] 0/1580, RunningAvgSamplesPerSec=11.980871860453664, CurrSamplesPerSec=11.991160769282347, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:20:49,450] [INFO] [timer.py:197:stop] 
0/1581, RunningAvgSamplesPerSec=11.98086098316813, CurrSamplesPerSec=11.963721197435271, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:20:55,929] [INFO] [timer.py:197:stop] 0/1582, RunningAvgSamplesPerSec=11.980869689166733, CurrSamplesPerSec=11.994632262049707, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:21:02,474] [INFO] [timer.py:197:stop] 0/1583, RunningAvgSamplesPerSec=11.98082233041505, CurrSamplesPerSec=11.906460228937553, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:21:08,885] [INFO] [timer.py:197:stop] 0/1584, RunningAvgSamplesPerSec=11.980829211356458, CurrSamplesPerSec=11.991717873041368, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:21:15,413] [INFO] [timer.py:197:stop] 0/1585, RunningAvgSamplesPerSec=11.980820558227668, CurrSamplesPerSec=11.967146941790284, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:21:21,903] [INFO] [timer.py:197:stop] 0/1586, RunningAvgSamplesPerSec=11.980802383781992, CurrSamplesPerSec=11.952101201512836, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:21:28,427] [INFO] [timer.py:197:stop] 0/1587, RunningAvgSamplesPerSec=11.980748403484602, CurrSamplesPerSec=11.89584990271198, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:21:34,914] [INFO] [timer.py:197:stop] 0/1588, RunningAvgSamplesPerSec=11.980759177021463, CurrSamplesPerSec=11.997859621414728, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:21:41,448] [INFO] [timer.py:197:stop] 0/1589, RunningAvgSamplesPerSec=11.980696480473933, CurrSamplesPerSec=11.88207877397393, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:21:47,850] [INFO] [logging.py:68:log_dist] [Rank 0] step=1590, skipped=4, lr=[7.588888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 21:21:47,850] [INFO] [timer.py:197:stop] 0/1590, RunningAvgSamplesPerSec=11.980705576292042, CurrSamplesPerSec=11.995158063817161, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:21:54,364] [INFO] 
[timer.py:197:stop] 0/1591, RunningAvgSamplesPerSec=11.980652243921137, CurrSamplesPerSec=11.896555298623705, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:22:00,839] [INFO] [timer.py:197:stop] 0/1592, RunningAvgSamplesPerSec=11.980599089749866, CurrSamplesPerSec=11.896728760806466, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:22:07,357] [INFO] [timer.py:197:stop] 0/1593, RunningAvgSamplesPerSec=11.980540164818432, CurrSamplesPerSec=11.887576975068836, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:22:13,894] [INFO] [timer.py:197:stop] 0/1594, RunningAvgSamplesPerSec=11.980465964107802, CurrSamplesPerSec=11.863565275617551, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:22:20,296] [INFO] [timer.py:197:stop] 0/1595, RunningAvgSamplesPerSec=11.980442083648278, CurrSamplesPerSec=11.94254472779822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:22:24,910] [INFO] [timer.py:197:stop] 0/1596, RunningAvgSamplesPerSec=11.982520388299175, CurrSamplesPerSec=16.558340609548964, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:22:31,448] [INFO] [timer.py:197:stop] 0/1597, RunningAvgSamplesPerSec=11.982437988894446, CurrSamplesPerSec=11.852518336302841, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:22:37,933] [INFO] [timer.py:197:stop] 0/1598, RunningAvgSamplesPerSec=11.982398406856253, CurrSamplesPerSec=11.919596158453919, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:22:44,408] [INFO] [timer.py:197:stop] 0/1599, RunningAvgSamplesPerSec=11.982351150416498, CurrSamplesPerSec=11.907401925843274, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:22:50,915] [INFO] [logging.py:68:log_dist] [Rank 0] step=1600, skipped=4, lr=[7.566666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 21:22:50,915] [INFO] [timer.py:197:stop] 0/1600, RunningAvgSamplesPerSec=11.982295117920357, CurrSamplesPerSec=11.8934749468287, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0002, 
'learning_rate': 7.566666666666667e-06, 'epoch': 42.11} [2022-12-19 21:22:57,432] [INFO] [timer.py:197:stop] 0/1601, RunningAvgSamplesPerSec=11.982257707401939, CurrSamplesPerSec=11.922772668086669, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:23:03,947] [INFO] [timer.py:197:stop] 0/1602, RunningAvgSamplesPerSec=11.982221606468169, CurrSamplesPerSec=11.92477314938439, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:23:10,480] [INFO] [timer.py:197:stop] 0/1603, RunningAvgSamplesPerSec=11.982183377373495, CurrSamplesPerSec=11.921327675743635, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:23:16,938] [INFO] [timer.py:197:stop] 0/1604, RunningAvgSamplesPerSec=11.982179220872135, CurrSamplesPerSec=11.975528358200165, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:23:23,452] [INFO] [timer.py:197:stop] 0/1605, RunningAvgSamplesPerSec=11.982149286408104, CurrSamplesPerSec=11.934385554578641, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:23:29,972] [INFO] [timer.py:197:stop] 0/1606, RunningAvgSamplesPerSec=11.98216946692985, CurrSamplesPerSec=12.014606471114641, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:23:36,419] [INFO] [timer.py:197:stop] 0/1607, RunningAvgSamplesPerSec=11.982188092799795, CurrSamplesPerSec=12.012138712390056, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:23:42,918] [INFO] [timer.py:197:stop] 0/1608, RunningAvgSamplesPerSec=11.982151970017856, CurrSamplesPerSec=11.924454256190247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:23:49,416] [INFO] [timer.py:197:stop] 0/1609, RunningAvgSamplesPerSec=11.982110685753986, CurrSamplesPerSec=11.916173247329814, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:23:56,226] [INFO] [logging.py:68:log_dist] [Rank 0] step=1610, skipped=4, lr=[7.544444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 21:23:56,227] [INFO] [timer.py:197:stop] 0/1610, RunningAvgSamplesPerSec=11.982060426375956, 
CurrSamplesPerSec=11.901834715460463, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:24:02,957] [INFO] [timer.py:197:stop] 0/1611, RunningAvgSamplesPerSec=11.982017192717453, CurrSamplesPerSec=11.912898744815257, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:24:09,866] [INFO] [timer.py:197:stop] 0/1612, RunningAvgSamplesPerSec=11.982005947909446, CurrSamplesPerSec=11.963940347935317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:24:16,788] [INFO] [timer.py:197:stop] 0/1613, RunningAvgSamplesPerSec=11.981900666852496, CurrSamplesPerSec=11.814764039622647, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:24:23,711] [INFO] [timer.py:197:stop] 0/1614, RunningAvgSamplesPerSec=11.981844215530032, CurrSamplesPerSec=11.891586622066407, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:24:30,216] [INFO] [timer.py:197:stop] 0/1615, RunningAvgSamplesPerSec=11.981813803975106, CurrSamplesPerSec=11.932990262205717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:24:36,697] [INFO] [timer.py:197:stop] 0/1616, RunningAvgSamplesPerSec=11.981713824940051, CurrSamplesPerSec=11.822590664848764, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:24:43,234] [INFO] [timer.py:197:stop] 0/1617, RunningAvgSamplesPerSec=11.981615288678501, CurrSamplesPerSec=11.824662362159453, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:24:49,844] [INFO] [timer.py:197:stop] 0/1618, RunningAvgSamplesPerSec=11.981591209686364, CurrSamplesPerSec=11.942829520608115, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:24:56,467] [INFO] [timer.py:197:stop] 0/1619, RunningAvgSamplesPerSec=11.981518710719298, CurrSamplesPerSec=11.865495585548958, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:25:02,989] [INFO] [logging.py:68:log_dist] [Rank 0] step=1620, skipped=4, lr=[7.5222222222222226e-06], mom=[[0.9, 0.999]] [2022-12-19 21:25:02,989] [INFO] [timer.py:197:stop] 0/1620, 
RunningAvgSamplesPerSec=11.98148103059911, CurrSamplesPerSec=11.920860735886249, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:25:09,494] [INFO] [timer.py:197:stop] 0/1621, RunningAvgSamplesPerSec=11.981501679422761, CurrSamplesPerSec=12.015004955982505, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:25:15,954] [INFO] [timer.py:197:stop] 0/1622, RunningAvgSamplesPerSec=11.981529713316863, CurrSamplesPerSec=12.027089277290258, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:25:22,459] [INFO] [timer.py:197:stop] 0/1623, RunningAvgSamplesPerSec=11.981436969250717, CurrSamplesPerSec=11.833053437473373, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:25:29,093] [INFO] [timer.py:197:stop] 0/1624, RunningAvgSamplesPerSec=11.981353255110122, CurrSamplesPerSec=11.847173290924525, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:25:35,733] [INFO] [timer.py:197:stop] 0/1625, RunningAvgSamplesPerSec=11.981323906586553, CurrSamplesPerSec=11.933909102476067, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 7.511111111111111e-06, 'epoch': 42.76} [2022-12-19 21:25:42,194] [INFO] [timer.py:197:stop] 0/1626, RunningAvgSamplesPerSec=11.981345380989088, CurrSamplesPerSec=12.016300079899631, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:25:48,703] [INFO] [timer.py:197:stop] 0/1627, RunningAvgSamplesPerSec=11.981272310800412, CurrSamplesPerSec=11.863770809847768, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:25:55,245] [INFO] [timer.py:197:stop] 0/1628, RunningAvgSamplesPerSec=11.981177456815782, CurrSamplesPerSec=11.828998714537466, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:26:01,806] [INFO] [timer.py:197:stop] 0/1629, RunningAvgSamplesPerSec=11.981083259626025, CurrSamplesPerSec=11.829853124713548, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:26:08,262] [INFO] [logging.py:68:log_dist] [Rank 0] step=1630, skipped=4, 
lr=[7.500000000000001e-06], mom=[[0.9, 0.999]] [2022-12-19 21:26:08,263] [INFO] [timer.py:197:stop] 0/1630, RunningAvgSamplesPerSec=11.981079028705723, CurrSamplesPerSec=11.974199276548777, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:26:14,831] [INFO] [timer.py:197:stop] 0/1631, RunningAvgSamplesPerSec=11.98108819934984, CurrSamplesPerSec=11.99603664689459, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:26:21,309] [INFO] [timer.py:197:stop] 0/1632, RunningAvgSamplesPerSec=11.981050431237723, CurrSamplesPerSec=11.919840689547192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:26:27,806] [INFO] [timer.py:197:stop] 0/1633, RunningAvgSamplesPerSec=11.981019624651523, CurrSamplesPerSec=11.931014598317265, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:26:32,468] [INFO] [timer.py:197:stop] 0/1634, RunningAvgSamplesPerSec=11.98307313491452, CurrSamplesPerSec=16.632725435054997, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:26:38,880] [INFO] [timer.py:197:stop] 0/1635, RunningAvgSamplesPerSec=11.983089902461371, CurrSamplesPerSec=12.010517210422016, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:26:45,398] [INFO] [timer.py:197:stop] 0/1636, RunningAvgSamplesPerSec=11.983059836247481, CurrSamplesPerSec=11.934162179758005, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:26:51,836] [INFO] [timer.py:197:stop] 0/1637, RunningAvgSamplesPerSec=11.983065418391984, CurrSamplesPerSec=11.992193594914767, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:26:58,385] [INFO] [timer.py:197:stop] 0/1638, RunningAvgSamplesPerSec=11.983024533849328, CurrSamplesPerSec=11.916549359206055, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:27:05,020] [INFO] [timer.py:197:stop] 0/1639, RunningAvgSamplesPerSec=11.982962644840033, CurrSamplesPerSec=11.882561090274864, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:27:11,633] [INFO] [logging.py:68:log_dist] [Rank 0] 
step=1640, skipped=4, lr=[7.477777777777779e-06], mom=[[0.9, 0.999]] [2022-12-19 21:27:11,634] [INFO] [timer.py:197:stop] 0/1640, RunningAvgSamplesPerSec=11.98283323860618, CurrSamplesPerSec=11.774677356579664, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:27:18,135] [INFO] [timer.py:197:stop] 0/1641, RunningAvgSamplesPerSec=11.982790703722703, CurrSamplesPerSec=11.9135215654856, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:27:24,727] [INFO] [timer.py:197:stop] 0/1642, RunningAvgSamplesPerSec=11.982609133824303, CurrSamplesPerSec=11.692232104231454, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:27:31,301] [INFO] [timer.py:197:stop] 0/1643, RunningAvgSamplesPerSec=11.98251880795065, CurrSamplesPerSec=11.836194420059647, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:27:37,827] [INFO] [timer.py:197:stop] 0/1644, RunningAvgSamplesPerSec=11.98245343084948, CurrSamplesPerSec=11.87612221499357, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:27:44,456] [INFO] [timer.py:197:stop] 0/1645, RunningAvgSamplesPerSec=11.982361115277186, CurrSamplesPerSec=11.832673710703157, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:27:51,080] [INFO] [timer.py:197:stop] 0/1646, RunningAvgSamplesPerSec=11.982250282583491, CurrSamplesPerSec=11.802879777297996, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:27:57,763] [INFO] [timer.py:197:stop] 0/1647, RunningAvgSamplesPerSec=11.982220829554945, CurrSamplesPerSec=11.933995052047397, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:28:04,206] [INFO] [timer.py:197:stop] 0/1648, RunningAvgSamplesPerSec=11.982250243732654, CurrSamplesPerSec=12.03083287006104, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:28:10,663] [INFO] [timer.py:197:stop] 0/1649, RunningAvgSamplesPerSec=11.982249843740773, CurrSamplesPerSec=11.98159149330097, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:28:17,152] [INFO] [logging.py:68:log_dist] 
[Rank 0] step=1650, skipped=4, lr=[7.455555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 21:28:17,153] [INFO] [timer.py:197:stop] 0/1650, RunningAvgSamplesPerSec=11.982266553807412, CurrSamplesPerSec=12.009851430379443, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 7.455555555555556e-06, 'epoch': 43.42} [2022-12-19 21:28:23,794] [INFO] [timer.py:197:stop] 0/1651, RunningAvgSamplesPerSec=11.982221370293797, CurrSamplesPerSec=11.908219099325207, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:28:30,572] [INFO] [timer.py:197:stop] 0/1652, RunningAvgSamplesPerSec=11.98223617676552, CurrSamplesPerSec=12.006701932067772, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:28:37,071] [INFO] [timer.py:197:stop] 0/1653, RunningAvgSamplesPerSec=11.982239211797287, CurrSamplesPerSec=11.987249109291767, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:28:43,555] [INFO] [timer.py:197:stop] 0/1654, RunningAvgSamplesPerSec=11.982200759313278, CurrSamplesPerSec=11.919050498618057, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:28:50,093] [INFO] [timer.py:197:stop] 0/1655, RunningAvgSamplesPerSec=11.982099176488607, CurrSamplesPerSec=11.816603595144969, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:28:56,647] [INFO] [timer.py:197:stop] 0/1656, RunningAvgSamplesPerSec=11.982054789812041, CurrSamplesPerSec=11.909130430211087, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:29:03,263] [INFO] [timer.py:197:stop] 0/1657, RunningAvgSamplesPerSec=11.98202913271697, CurrSamplesPerSec=11.939742156093999, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:29:09,816] [INFO] [timer.py:197:stop] 0/1658, RunningAvgSamplesPerSec=11.98200177173784, CurrSamplesPerSec=11.936889941070058, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:29:16,445] [INFO] [timer.py:197:stop] 0/1659, RunningAvgSamplesPerSec=11.981964286316135, CurrSamplesPerSec=11.920208563662936, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:29:22,958] [INFO] [logging.py:68:log_dist] [Rank 0] step=1660, skipped=4, lr=[7.433333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 21:29:22,959] [INFO] [timer.py:197:stop] 0/1660, RunningAvgSamplesPerSec=11.98189425488096, CurrSamplesPerSec=11.866965895157888, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:29:29,488] [INFO] [timer.py:197:stop] 0/1661, RunningAvgSamplesPerSec=11.981822778381165, CurrSamplesPerSec=11.864476077100422, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:29:35,973] [INFO] [timer.py:197:stop] 0/1662, RunningAvgSamplesPerSec=11.981794422320045, CurrSamplesPerSec=11.934935803283818, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:29:42,359] [INFO] [timer.py:197:stop] 0/1663, RunningAvgSamplesPerSec=11.981743583703798, CurrSamplesPerSec=11.897942083025232, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:29:48,873] [INFO] [timer.py:197:stop] 0/1664, RunningAvgSamplesPerSec=11.981706940142452, CurrSamplesPerSec=11.921149789579973, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:29:55,497] [INFO] [timer.py:197:stop] 0/1665, RunningAvgSamplesPerSec=11.981670080064534, CurrSamplesPerSec=11.920720449310945, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:30:02,084] [INFO] [timer.py:197:stop] 0/1666, RunningAvgSamplesPerSec=11.981683795062168, CurrSamplesPerSec=12.004535361968985, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:30:08,550] [INFO] [timer.py:197:stop] 0/1667, RunningAvgSamplesPerSec=11.98170870786476, CurrSamplesPerSec=12.023307623843978, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:30:15,058] [INFO] [timer.py:197:stop] 0/1668, RunningAvgSamplesPerSec=11.981687080238327, CurrSamplesPerSec=11.945785047703847, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:30:21,490] [INFO] [timer.py:197:stop] 0/1669, RunningAvgSamplesPerSec=11.981685033176312, 
CurrSamplesPerSec=11.97827559888822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:30:28,073] [INFO] [logging.py:68:log_dist] [Rank 0] step=1670, skipped=4, lr=[7.411111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 21:30:28,073] [INFO] [timer.py:197:stop] 0/1670, RunningAvgSamplesPerSec=11.981646150595335, CurrSamplesPerSec=11.917177852284338, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:30:34,648] [INFO] [timer.py:197:stop] 0/1671, RunningAvgSamplesPerSec=11.98161883146648, CurrSamplesPerSec=11.93622327571813, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:30:39,317] [INFO] [timer.py:197:stop] 0/1672, RunningAvgSamplesPerSec=11.98361999491057, CurrSamplesPerSec=16.615197059671875, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:30:45,849] [INFO] [timer.py:197:stop] 0/1673, RunningAvgSamplesPerSec=11.98359360001985, CurrSamplesPerSec=11.939675772980774, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:30:52,416] [INFO] [timer.py:197:stop] 0/1674, RunningAvgSamplesPerSec=11.983613224247115, CurrSamplesPerSec=12.016495340820954, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:30:59,007] [INFO] [timer.py:197:stop] 0/1675, RunningAvgSamplesPerSec=11.983577205699804, CurrSamplesPerSec=11.923655508905158, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 7.4e-06, 'epoch': 44.08} [2022-12-19 21:31:05,569] [INFO] [timer.py:197:stop] 0/1676, RunningAvgSamplesPerSec=11.983599797260238, CurrSamplesPerSec=12.02151513243461, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:31:12,189] [INFO] [timer.py:197:stop] 0/1677, RunningAvgSamplesPerSec=11.98356230889577, CurrSamplesPerSec=11.921133907144945, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:31:18,731] [INFO] [timer.py:197:stop] 0/1678, RunningAvgSamplesPerSec=11.983478509715262, CurrSamplesPerSec=11.844740899408057, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:31:25,247] [INFO] 
[timer.py:197:stop] 0/1679, RunningAvgSamplesPerSec=11.983456248256974, CurrSamplesPerSec=11.946261916925078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:31:31,783] [INFO] [logging.py:68:log_dist] [Rank 0] step=1680, skipped=4, lr=[7.38888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 21:31:31,784] [INFO] [timer.py:197:stop] 0/1680, RunningAvgSamplesPerSec=11.983382520542499, CurrSamplesPerSec=11.861004559785972, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:31:38,315] [INFO] [timer.py:197:stop] 0/1681, RunningAvgSamplesPerSec=11.983401581651288, CurrSamplesPerSec=12.015471770795132, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:31:44,964] [INFO] [timer.py:197:stop] 0/1682, RunningAvgSamplesPerSec=11.983361702478488, CurrSamplesPerSec=11.916776836419537, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:31:51,583] [INFO] [timer.py:197:stop] 0/1683, RunningAvgSamplesPerSec=11.98337493648801, CurrSamplesPerSec=12.005649423646172, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:31:58,055] [INFO] [timer.py:197:stop] 0/1684, RunningAvgSamplesPerSec=11.983345210442701, CurrSamplesPerSec=11.933583354227682, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:32:04,532] [INFO] [timer.py:197:stop] 0/1685, RunningAvgSamplesPerSec=11.983304086293455, CurrSamplesPerSec=11.914530482810743, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:32:11,167] [INFO] [timer.py:197:stop] 0/1686, RunningAvgSamplesPerSec=11.983235668069268, CurrSamplesPerSec=11.869184375937797, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:32:17,670] [INFO] [timer.py:197:stop] 0/1687, RunningAvgSamplesPerSec=11.98316875568917, CurrSamplesPerSec=11.871538614420635, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:32:24,205] [INFO] [timer.py:197:stop] 0/1688, RunningAvgSamplesPerSec=11.983099309052353, CurrSamplesPerSec=11.867214042559475, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
21:32:30,678] [INFO] [timer.py:197:stop] 0/1689, RunningAvgSamplesPerSec=11.983054912709598, CurrSamplesPerSec=11.908667615182397, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:32:37,179] [INFO] [logging.py:68:log_dist] [Rank 0] step=1690, skipped=4, lr=[7.3666666666666676e-06], mom=[[0.9, 0.999]] [2022-12-19 21:32:37,179] [INFO] [timer.py:197:stop] 0/1690, RunningAvgSamplesPerSec=11.983014114437038, CurrSamplesPerSec=11.914580721629955, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:32:43,644] [INFO] [timer.py:197:stop] 0/1691, RunningAvgSamplesPerSec=11.982994383462653, CurrSamplesPerSec=11.949780868007531, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:32:50,227] [INFO] [timer.py:197:stop] 0/1692, RunningAvgSamplesPerSec=11.982896152907854, CurrSamplesPerSec=11.81925185475882, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:32:56,777] [INFO] [timer.py:197:stop] 0/1693, RunningAvgSamplesPerSec=11.982845458021737, CurrSamplesPerSec=11.89777965962041, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:33:03,260] [INFO] [timer.py:197:stop] 0/1694, RunningAvgSamplesPerSec=11.982849959384366, CurrSamplesPerSec=11.990466604730061, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:33:09,753] [INFO] [timer.py:197:stop] 0/1695, RunningAvgSamplesPerSec=11.982815322773265, CurrSamplesPerSec=11.924495573615749, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:33:16,206] [INFO] [timer.py:197:stop] 0/1696, RunningAvgSamplesPerSec=11.982791302816116, CurrSamplesPerSec=11.942263136322547, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:33:22,803] [INFO] [timer.py:197:stop] 0/1697, RunningAvgSamplesPerSec=11.982721312094771, CurrSamplesPerSec=11.865319361687702, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:33:29,287] [INFO] [timer.py:197:stop] 0/1698, RunningAvgSamplesPerSec=11.98267587860052, CurrSamplesPerSec=11.906158156943066, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 21:33:35,728] [INFO] [timer.py:197:stop] 0/1699, RunningAvgSamplesPerSec=11.982646836014098, CurrSamplesPerSec=11.933592373076623, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:33:42,141] [INFO] [logging.py:68:log_dist] [Rank 0] step=1700, skipped=4, lr=[7.344444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 21:33:42,141] [INFO] [timer.py:197:stop] 0/1700, RunningAvgSamplesPerSec=11.982597981795001, CurrSamplesPerSec=11.90026237410618, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 7.344444444444445e-06, 'epoch': 44.74} [2022-12-19 21:33:48,561] [INFO] [timer.py:197:stop] 0/1701, RunningAvgSamplesPerSec=11.98261684427808, CurrSamplesPerSec=12.014731230066305, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:33:55,017] [INFO] [timer.py:197:stop] 0/1702, RunningAvgSamplesPerSec=11.982582832967749, CurrSamplesPerSec=11.92507510716601, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:34:01,546] [INFO] [timer.py:197:stop] 0/1703, RunningAvgSamplesPerSec=11.982544551649275, CurrSamplesPerSec=11.91781805227484, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:34:08,044] [INFO] [timer.py:197:stop] 0/1704, RunningAvgSamplesPerSec=11.982468821357704, CurrSamplesPerSec=11.855022509201886, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:34:14,513] [INFO] [timer.py:197:stop] 0/1705, RunningAvgSamplesPerSec=11.98246455480272, CurrSamplesPerSec=11.975207278893665, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:34:21,014] [INFO] [timer.py:197:stop] 0/1706, RunningAvgSamplesPerSec=11.98240660732661, CurrSamplesPerSec=11.88452863104639, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:34:27,492] [INFO] [timer.py:197:stop] 0/1707, RunningAvgSamplesPerSec=11.982433970442909, CurrSamplesPerSec=12.029242973445816, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:34:33,931] [INFO] [timer.py:197:stop] 0/1708, 
RunningAvgSamplesPerSec=11.982450114391215, CurrSamplesPerSec=12.010038959017487, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:34:40,478] [INFO] [timer.py:197:stop] 0/1709, RunningAvgSamplesPerSec=11.982436187509913, CurrSamplesPerSec=11.95872397310783, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:34:45,118] [INFO] [logging.py:68:log_dist] [Rank 0] step=1710, skipped=4, lr=[7.322222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 21:34:45,119] [INFO] [timer.py:197:stop] 0/1710, RunningAvgSamplesPerSec=11.984379557929813, CurrSamplesPerSec=16.572459124403217, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:34:51,548] [INFO] [timer.py:197:stop] 0/1711, RunningAvgSamplesPerSec=11.984376738027912, CurrSamplesPerSec=11.979562281590974, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:34:58,085] [INFO] [timer.py:197:stop] 0/1712, RunningAvgSamplesPerSec=11.984276764184143, CurrSamplesPerSec=11.815824431589316, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:35:04,553] [INFO] [timer.py:197:stop] 0/1713, RunningAvgSamplesPerSec=11.98421892794508, CurrSamplesPerSec=11.886128921959804, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:35:11,048] [INFO] [timer.py:197:stop] 0/1714, RunningAvgSamplesPerSec=11.984187488385006, CurrSamplesPerSec=11.930634921502659, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:35:17,511] [INFO] [timer.py:197:stop] 0/1715, RunningAvgSamplesPerSec=11.984094751645989, CurrSamplesPerSec=11.827406473389603, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:35:24,038] [INFO] [timer.py:197:stop] 0/1716, RunningAvgSamplesPerSec=11.98403699441524, CurrSamplesPerSec=11.885909455059197, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:35:30,545] [INFO] [timer.py:197:stop] 0/1717, RunningAvgSamplesPerSec=11.984007203691071, CurrSamplesPerSec=11.933162666599333, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:35:37,024] [INFO] 
[timer.py:197:stop] 0/1718, RunningAvgSamplesPerSec=11.983973679070353, CurrSamplesPerSec=11.926753635457697, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:35:43,489] [INFO] [timer.py:197:stop] 0/1719, RunningAvgSamplesPerSec=11.98393029404217, CurrSamplesPerSec=11.909941499918451, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:35:49,945] [INFO] [logging.py:68:log_dist] [Rank 0] step=1720, skipped=4, lr=[7.3e-06], mom=[[0.9, 0.999]] [2022-12-19 21:35:49,945] [INFO] [timer.py:197:stop] 0/1720, RunningAvgSamplesPerSec=11.9839477389403, CurrSamplesPerSec=12.013975725217769, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:35:56,468] [INFO] [timer.py:197:stop] 0/1721, RunningAvgSamplesPerSec=11.98392251116039, CurrSamplesPerSec=11.940737460223001, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:36:02,972] [INFO] [timer.py:197:stop] 0/1722, RunningAvgSamplesPerSec=11.983888912345584, CurrSamplesPerSec=11.926409731802298, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:36:09,437] [INFO] [timer.py:197:stop] 0/1723, RunningAvgSamplesPerSec=11.9838604323071, CurrSamplesPerSec=11.93507430203043, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:36:15,873] [INFO] [timer.py:197:stop] 0/1724, RunningAvgSamplesPerSec=11.983885610651448, CurrSamplesPerSec=12.027374883320974, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:36:22,555] [INFO] [timer.py:197:stop] 0/1725, RunningAvgSamplesPerSec=11.983618362176369, CurrSamplesPerSec=11.54044535850125, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 7.28888888888889e-06, 'epoch': 45.39} [2022-12-19 21:36:29,061] [INFO] [timer.py:197:stop] 0/1726, RunningAvgSamplesPerSec=11.9835560029953, CurrSamplesPerSec=11.877066475974372, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:36:35,493] [INFO] [timer.py:197:stop] 0/1727, RunningAvgSamplesPerSec=11.983578716960656, CurrSamplesPerSec=12.022866047355864, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:36:41,903] [INFO] [timer.py:197:stop] 0/1728, RunningAvgSamplesPerSec=11.98360254283499, CurrSamplesPerSec=12.024843701039336, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:36:48,328] [INFO] [timer.py:197:stop] 0/1729, RunningAvgSamplesPerSec=11.983618596258953, CurrSamplesPerSec=12.01139105798458, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:36:54,909] [INFO] [logging.py:68:log_dist] [Rank 0] step=1730, skipped=4, lr=[7.277777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 21:36:54,909] [INFO] [timer.py:197:stop] 0/1730, RunningAvgSamplesPerSec=11.983494288872617, CurrSamplesPerSec=11.772595771494801, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:37:01,394] [INFO] [timer.py:197:stop] 0/1731, RunningAvgSamplesPerSec=11.983476092466178, CurrSamplesPerSec=11.952115037893307, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:37:07,929] [INFO] [timer.py:197:stop] 0/1732, RunningAvgSamplesPerSec=11.983427205976033, CurrSamplesPerSec=11.899494821455573, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:37:14,460] [INFO] [timer.py:197:stop] 0/1733, RunningAvgSamplesPerSec=11.983381398518546, CurrSamplesPerSec=11.90465541816453, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:37:20,959] [INFO] [timer.py:197:stop] 0/1734, RunningAvgSamplesPerSec=11.983396123943171, CurrSamplesPerSec=12.008940199798076, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:37:27,488] [INFO] [timer.py:197:stop] 0/1735, RunningAvgSamplesPerSec=11.983361515075478, CurrSamplesPerSec=11.923717476848212, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:37:34,011] [INFO] [timer.py:197:stop] 0/1736, RunningAvgSamplesPerSec=11.983348532948375, CurrSamplesPerSec=11.960892690520565, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:37:40,460] [INFO] [timer.py:197:stop] 0/1737, RunningAvgSamplesPerSec=11.983322776290722, 
CurrSamplesPerSec=11.938826665307493, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:37:46,914] [INFO] [timer.py:197:stop] 0/1738, RunningAvgSamplesPerSec=11.983290466090866, CurrSamplesPerSec=11.92749343996751, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:37:53,366] [INFO] [timer.py:197:stop] 0/1739, RunningAvgSamplesPerSec=11.983310081715326, CurrSamplesPerSec=12.017459904523953, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:37:59,730] [INFO] [logging.py:68:log_dist] [Rank 0] step=1740, skipped=4, lr=[7.255555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 21:37:59,730] [INFO] [timer.py:197:stop] 0/1740, RunningAvgSamplesPerSec=11.98333395783772, CurrSamplesPerSec=12.024950896390061, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:38:06,257] [INFO] [timer.py:197:stop] 0/1741, RunningAvgSamplesPerSec=11.983329553106383, CurrSamplesPerSec=11.975679020315413, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:38:12,750] [INFO] [timer.py:197:stop] 0/1742, RunningAvgSamplesPerSec=11.983312835723082, CurrSamplesPerSec=11.954311703399796, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:38:19,215] [INFO] [timer.py:197:stop] 0/1743, RunningAvgSamplesPerSec=11.98330272967039, CurrSamplesPerSec=11.965743978847502, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:38:25,622] [INFO] [timer.py:197:stop] 0/1744, RunningAvgSamplesPerSec=11.983301755481536, CurrSamplesPerSec=11.981605932843221, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:38:32,128] [INFO] [timer.py:197:stop] 0/1745, RunningAvgSamplesPerSec=11.983318667500786, CurrSamplesPerSec=12.012852053924137, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:38:39,202] [INFO] [timer.py:197:stop] 0/1746, RunningAvgSamplesPerSec=11.98326164050332, CurrSamplesPerSec=11.884681747770234, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:38:46,204] [INFO] [timer.py:197:stop] 0/1747, 
RunningAvgSamplesPerSec=11.983180850505871, CurrSamplesPerSec=11.843921440344767, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:38:51,102] [INFO] [timer.py:197:stop] 0/1748, RunningAvgSamplesPerSec=11.985016294281538, CurrSamplesPerSec=16.356856044982482, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:38:58,003] [INFO] [timer.py:197:stop] 0/1749, RunningAvgSamplesPerSec=11.984991878045964, CurrSamplesPerSec=11.942512317537043, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:39:04,702] [INFO] [logging.py:68:log_dist] [Rank 0] step=1750, skipped=4, lr=[7.233333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 21:39:04,703] [INFO] [timer.py:197:stop] 0/1750, RunningAvgSamplesPerSec=11.98493038754394, CurrSamplesPerSec=11.87846133531671, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 7.233333333333334e-06, 'epoch': 46.05} [2022-12-19 21:39:11,117] [INFO] [timer.py:197:stop] 0/1751, RunningAvgSamplesPerSec=11.984943448121035, CurrSamplesPerSec=12.007816933070943, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:39:17,587] [INFO] [timer.py:197:stop] 0/1752, RunningAvgSamplesPerSec=11.984898025340968, CurrSamplesPerSec=11.905977026717094, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:39:24,108] [INFO] [timer.py:197:stop] 0/1753, RunningAvgSamplesPerSec=11.984918676377681, CurrSamplesPerSec=12.021167357329741, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:39:30,842] [INFO] [timer.py:197:stop] 0/1754, RunningAvgSamplesPerSec=11.984885849818976, CurrSamplesPerSec=11.927681055412357, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:39:37,280] [INFO] [timer.py:197:stop] 0/1755, RunningAvgSamplesPerSec=11.984890006220313, CurrSamplesPerSec=11.99217645112862, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:39:43,737] [INFO] [timer.py:197:stop] 0/1756, RunningAvgSamplesPerSec=11.984901364243706, CurrSamplesPerSec=12.004845130896125, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:39:50,182] [INFO] [timer.py:197:stop] 0/1757, RunningAvgSamplesPerSec=11.984846855979146, CurrSamplesPerSec=11.889996447646189, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:39:56,633] [INFO] [timer.py:197:stop] 0/1758, RunningAvgSamplesPerSec=11.98485630795293, CurrSamplesPerSec=12.001467526592661, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:40:03,248] [INFO] [timer.py:197:stop] 0/1759, RunningAvgSamplesPerSec=11.984824315974437, CurrSamplesPerSec=11.928908651249271, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:40:09,891] [INFO] [logging.py:68:log_dist] [Rank 0] step=1760, skipped=4, lr=[7.211111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 21:40:09,892] [INFO] [timer.py:197:stop] 0/1760, RunningAvgSamplesPerSec=11.984788685900769, CurrSamplesPerSec=11.922512130575303, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:40:16,553] [INFO] [timer.py:197:stop] 0/1761, RunningAvgSamplesPerSec=11.984756495325104, CurrSamplesPerSec=11.928431576218008, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:40:23,069] [INFO] [timer.py:197:stop] 0/1762, RunningAvgSamplesPerSec=11.984761572939007, CurrSamplesPerSec=11.993699760675977, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:40:29,541] [INFO] [timer.py:197:stop] 0/1763, RunningAvgSamplesPerSec=11.98477385911681, CurrSamplesPerSec=12.006436639591792, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:40:36,127] [INFO] [timer.py:197:stop] 0/1764, RunningAvgSamplesPerSec=11.984658909130781, CurrSamplesPerSec=11.785596163074597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:40:42,584] [INFO] [timer.py:197:stop] 0/1765, RunningAvgSamplesPerSec=11.984658134842757, CurrSamplesPerSec=11.983293994720816, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:40:49,124] [INFO] [timer.py:197:stop] 0/1766, RunningAvgSamplesPerSec=11.984674378472103, 
CurrSamplesPerSec=12.013380529598376, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:40:55,793] [INFO] [timer.py:197:stop] 0/1767, RunningAvgSamplesPerSec=11.984615011762951, CurrSamplesPerSec=11.88079979973494, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:41:02,289] [INFO] [timer.py:197:stop] 0/1768, RunningAvgSamplesPerSec=11.984570101995423, CurrSamplesPerSec=11.905825473007772, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:41:08,813] [INFO] [timer.py:197:stop] 0/1769, RunningAvgSamplesPerSec=11.98452307590679, CurrSamplesPerSec=11.90204685506011, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:41:15,268] [INFO] [logging.py:68:log_dist] [Rank 0] step=1770, skipped=4, lr=[7.188888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 21:41:15,269] [INFO] [timer.py:197:stop] 0/1770, RunningAvgSamplesPerSec=11.98452577652348, CurrSamplesPerSec=11.989299668158223, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:41:21,784] [INFO] [timer.py:197:stop] 0/1771, RunningAvgSamplesPerSec=11.984499355008646, CurrSamplesPerSec=11.937967591098031, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:41:28,343] [INFO] [timer.py:197:stop] 0/1772, RunningAvgSamplesPerSec=11.984467621386695, CurrSamplesPerSec=11.928592717294011, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:41:34,974] [INFO] [timer.py:197:stop] 0/1773, RunningAvgSamplesPerSec=11.984415275240861, CurrSamplesPerSec=11.892473807118478, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:41:41,536] [INFO] [timer.py:197:stop] 0/1774, RunningAvgSamplesPerSec=11.984411569797668, CurrSamplesPerSec=11.977852823323278, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:41:48,197] [INFO] [timer.py:197:stop] 0/1775, RunningAvgSamplesPerSec=11.984360563842776, CurrSamplesPerSec=11.894654927312608, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 7.177777777777778e-06, 'epoch': 46.71} [2022-12-19 
21:41:54,840] [INFO] [timer.py:197:stop] 0/1776, RunningAvgSamplesPerSec=11.984304253967872, CurrSamplesPerSec=11.885292149180918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:42:01,415] [INFO] [timer.py:197:stop] 0/1777, RunningAvgSamplesPerSec=11.984270827705922, CurrSamplesPerSec=11.925264765661515, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:42:08,057] [INFO] [timer.py:197:stop] 0/1778, RunningAvgSamplesPerSec=11.98421407016559, CurrSamplesPerSec=11.88430974938531, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:42:14,616] [INFO] [timer.py:197:stop] 0/1779, RunningAvgSamplesPerSec=11.984172563739168, CurrSamplesPerSec=11.910908058825902, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:42:21,129] [INFO] [logging.py:68:log_dist] [Rank 0] step=1780, skipped=4, lr=[7.166666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 21:42:21,130] [INFO] [timer.py:197:stop] 0/1780, RunningAvgSamplesPerSec=11.984117913766116, CurrSamplesPerSec=11.887785974424796, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:42:27,533] [INFO] [timer.py:197:stop] 0/1781, RunningAvgSamplesPerSec=11.984127919188763, CurrSamplesPerSec=12.001944022350354, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:42:34,026] [INFO] [timer.py:197:stop] 0/1782, RunningAvgSamplesPerSec=11.984134495541923, CurrSamplesPerSec=11.995845266706734, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:42:40,439] [INFO] [timer.py:197:stop] 0/1783, RunningAvgSamplesPerSec=11.984138147566162, CurrSamplesPerSec=11.990642280754894, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:42:47,050] [INFO] [timer.py:197:stop] 0/1784, RunningAvgSamplesPerSec=11.984120470796105, CurrSamplesPerSec=11.952720677011706, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:42:53,626] [INFO] [timer.py:197:stop] 0/1785, RunningAvgSamplesPerSec=11.984071977487108, CurrSamplesPerSec=11.898275908686145, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 21:42:58,496] [INFO] [timer.py:197:stop] 0/1786, RunningAvgSamplesPerSec=11.985886820224465, CurrSamplesPerSec=16.419333033738436, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:43:05,019] [INFO] [timer.py:197:stop] 0/1787, RunningAvgSamplesPerSec=11.985794509661009, CurrSamplesPerSec=11.823345724330707, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:43:11,534] [INFO] [timer.py:197:stop] 0/1788, RunningAvgSamplesPerSec=11.985778893094057, CurrSamplesPerSec=11.957968037597585, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:43:18,197] [INFO] [timer.py:197:stop] 0/1789, RunningAvgSamplesPerSec=11.985720324595192, CurrSamplesPerSec=11.882022497523542, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:43:24,708] [INFO] [logging.py:68:log_dist] [Rank 0] step=1790, skipped=4, lr=[7.1444444444444446e-06], mom=[[0.9, 0.999]] [2022-12-19 21:43:24,708] [INFO] [timer.py:197:stop] 0/1790, RunningAvgSamplesPerSec=11.985637015680787, CurrSamplesPerSec=11.838591453674773, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:43:31,211] [INFO] [timer.py:197:stop] 0/1791, RunningAvgSamplesPerSec=11.985598905571031, CurrSamplesPerSec=11.917843450056536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:43:37,633] [INFO] [timer.py:197:stop] 0/1792, RunningAvgSamplesPerSec=11.98559158297315, CurrSamplesPerSec=11.972505766024982, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:43:44,216] [INFO] [timer.py:197:stop] 0/1793, RunningAvgSamplesPerSec=11.9854599833142, CurrSamplesPerSec=11.754439626811681, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:43:50,776] [INFO] [timer.py:197:stop] 0/1794, RunningAvgSamplesPerSec=11.985461067566542, CurrSamplesPerSec=11.987403278362898, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:43:57,277] [INFO] [timer.py:197:stop] 0/1795, RunningAvgSamplesPerSec=11.985453020358069, CurrSamplesPerSec=11.971049762084514, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:44:03,799] [INFO] [timer.py:197:stop] 0/1796, RunningAvgSamplesPerSec=11.985407881802923, CurrSamplesPerSec=11.905017603705218, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:44:10,279] [INFO] [timer.py:197:stop] 0/1797, RunningAvgSamplesPerSec=11.985406381667115, CurrSamplesPerSec=11.982715742530273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:44:16,797] [INFO] [timer.py:197:stop] 0/1798, RunningAvgSamplesPerSec=11.985332224696922, CurrSamplesPerSec=11.853683398848226, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:44:23,339] [INFO] [timer.py:197:stop] 0/1799, RunningAvgSamplesPerSec=11.985309137421515, CurrSamplesPerSec=11.94398742825544, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:44:29,822] [INFO] [logging.py:68:log_dist] [Rank 0] step=1800, skipped=4, lr=[7.122222222222222e-06], mom=[[0.9, 0.999]] [2022-12-19 21:44:29,823] [INFO] [timer.py:197:stop] 0/1800, RunningAvgSamplesPerSec=11.98530805991732, CurrSamplesPerSec=11.983372097814435, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 7.122222222222222e-06, 'epoch': 47.37} [2022-12-19 21:44:36,320] [INFO] [timer.py:197:stop] 0/1801, RunningAvgSamplesPerSec=11.985332475316728, CurrSamplesPerSec=12.029392833857713, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:44:42,848] [INFO] [timer.py:197:stop] 0/1802, RunningAvgSamplesPerSec=11.985301163099315, CurrSamplesPerSec=11.9292341443356, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:44:49,287] [INFO] [timer.py:197:stop] 0/1803, RunningAvgSamplesPerSec=11.985275512810091, CurrSamplesPerSec=11.939282269042875, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:44:55,771] [INFO] [timer.py:197:stop] 0/1804, RunningAvgSamplesPerSec=11.98526753892771, CurrSamplesPerSec=11.970923773259445, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:45:02,236] [INFO] [timer.py:197:stop] 
0/1805, RunningAvgSamplesPerSec=11.985272473893103, CurrSamplesPerSec=11.994171888364736, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:45:08,935] [INFO] [timer.py:197:stop] 0/1806, RunningAvgSamplesPerSec=11.98517822964499, CurrSamplesPerSec=11.817632581364814, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:45:15,540] [INFO] [timer.py:197:stop] 0/1807, RunningAvgSamplesPerSec=11.98514512072562, CurrSamplesPerSec=11.925712977029375, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:45:22,195] [INFO] [timer.py:197:stop] 0/1808, RunningAvgSamplesPerSec=11.985145000771485, CurrSamplesPerSec=11.984928487466064, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:45:28,729] [INFO] [timer.py:197:stop] 0/1809, RunningAvgSamplesPerSec=11.985166652182741, CurrSamplesPerSec=12.024397164086347, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:45:35,214] [INFO] [logging.py:68:log_dist] [Rank 0] step=1810, skipped=4, lr=[7.100000000000001e-06], mom=[[0.9, 0.999]] [2022-12-19 21:45:35,215] [INFO] [timer.py:197:stop] 0/1810, RunningAvgSamplesPerSec=11.985135271274256, CurrSamplesPerSec=11.928697143276636, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:45:41,693] [INFO] [timer.py:197:stop] 0/1811, RunningAvgSamplesPerSec=11.985143533021116, CurrSamplesPerSec=12.000099421371196, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:45:48,469] [INFO] [timer.py:197:stop] 0/1812, RunningAvgSamplesPerSec=11.985091616251685, CurrSamplesPerSec=11.891904813356973, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:45:55,083] [INFO] [timer.py:197:stop] 0/1813, RunningAvgSamplesPerSec=11.985089311275072, CurrSamplesPerSec=11.980918756177124, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:46:01,741] [INFO] [timer.py:197:stop] 0/1814, RunningAvgSamplesPerSec=11.985052599819172, CurrSamplesPerSec=11.91893512786409, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:46:08,235] [INFO] 
[timer.py:197:stop] 0/1815, RunningAvgSamplesPerSec=11.985055023015022, CurrSamplesPerSec=11.989447463990627, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:46:14,782] [INFO] [timer.py:197:stop] 0/1816, RunningAvgSamplesPerSec=11.9850199566921, CurrSamplesPerSec=11.921780356442573, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:46:21,245] [INFO] [timer.py:197:stop] 0/1817, RunningAvgSamplesPerSec=11.984989426711289, CurrSamplesPerSec=11.929862915804025, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:46:27,730] [INFO] [timer.py:197:stop] 0/1818, RunningAvgSamplesPerSec=11.985005718459028, CurrSamplesPerSec=12.014648415645482, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:46:34,340] [INFO] [timer.py:197:stop] 0/1819, RunningAvgSamplesPerSec=11.984972888453841, CurrSamplesPerSec=11.925648869083355, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:46:40,913] [INFO] [logging.py:68:log_dist] [Rank 0] step=1820, skipped=4, lr=[7.077777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 21:46:40,913] [INFO] [timer.py:197:stop] 0/1820, RunningAvgSamplesPerSec=11.984980772494776, CurrSamplesPerSec=11.999323227429155, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:46:47,429] [INFO] [timer.py:197:stop] 0/1821, RunningAvgSamplesPerSec=11.984947706208484, CurrSamplesPerSec=11.925133381480697, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:46:53,915] [INFO] [timer.py:197:stop] 0/1822, RunningAvgSamplesPerSec=11.984965134539761, CurrSamplesPerSec=12.016751395061732, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:47:00,285] [INFO] [timer.py:197:stop] 0/1823, RunningAvgSamplesPerSec=11.984978565352804, CurrSamplesPerSec=12.009472629647034, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:47:04,927] [INFO] [timer.py:197:stop] 0/1824, RunningAvgSamplesPerSec=11.986762304737248, CurrSamplesPerSec=16.443236397821572, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
21:47:11,369] [INFO] [timer.py:197:stop] 0/1825, RunningAvgSamplesPerSec=11.986791098223744, CurrSamplesPerSec=12.039483572581231, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 7.066666666666667e-06, 'epoch': 48.03} [2022-12-19 21:47:17,845] [INFO] [timer.py:197:stop] 0/1826, RunningAvgSamplesPerSec=11.98681219315209, CurrSamplesPerSec=12.025392087324711, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:47:24,345] [INFO] [timer.py:197:stop] 0/1827, RunningAvgSamplesPerSec=11.98678385747151, CurrSamplesPerSec=11.935321591310668, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:47:30,825] [INFO] [timer.py:197:stop] 0/1828, RunningAvgSamplesPerSec=11.986755310833766, CurrSamplesPerSec=11.934883270117338, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:47:37,316] [INFO] [timer.py:197:stop] 0/1829, RunningAvgSamplesPerSec=11.986690449570945, CurrSamplesPerSec=11.86941319738902, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:47:43,710] [INFO] [logging.py:68:log_dist] [Rank 0] step=1830, skipped=4, lr=[7.055555555555557e-06], mom=[[0.9, 0.999]] [2022-12-19 21:47:43,711] [INFO] [timer.py:197:stop] 0/1830, RunningAvgSamplesPerSec=11.9867146667743, CurrSamplesPerSec=12.031123506969623, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:47:50,211] [INFO] [timer.py:197:stop] 0/1831, RunningAvgSamplesPerSec=11.986674084552588, CurrSamplesPerSec=11.91294632643987, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:47:56,734] [INFO] [timer.py:197:stop] 0/1832, RunningAvgSamplesPerSec=11.986642411597062, CurrSamplesPerSec=11.928991348135268, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:48:03,284] [INFO] [timer.py:197:stop] 0/1833, RunningAvgSamplesPerSec=11.986617715228812, CurrSamplesPerSec=11.941593214512084, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:48:09,839] [INFO] [timer.py:197:stop] 0/1834, RunningAvgSamplesPerSec=11.986528718261477, 
CurrSamplesPerSec=11.825762041938905, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:48:16,318] [INFO] [timer.py:197:stop] 0/1835, RunningAvgSamplesPerSec=11.986493393485004, CurrSamplesPerSec=11.92212611117282, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:48:22,780] [INFO] [timer.py:197:stop] 0/1836, RunningAvgSamplesPerSec=11.986468679004428, CurrSamplesPerSec=11.941337697128937, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:48:29,246] [INFO] [timer.py:197:stop] 0/1837, RunningAvgSamplesPerSec=11.986445414371419, CurrSamplesPerSec=11.943929500968537, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:48:35,756] [INFO] [timer.py:197:stop] 0/1838, RunningAvgSamplesPerSec=11.986405543171646, CurrSamplesPerSec=11.913686005238489, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:48:42,255] [INFO] [timer.py:197:stop] 0/1839, RunningAvgSamplesPerSec=11.986350553135821, CurrSamplesPerSec=11.886232605784457, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:48:48,765] [INFO] [logging.py:68:log_dist] [Rank 0] step=1840, skipped=4, lr=[7.033333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 21:48:48,765] [INFO] [timer.py:197:stop] 0/1840, RunningAvgSamplesPerSec=11.986333281536755, CurrSamplesPerSec=11.954689161909286, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:48:55,233] [INFO] [timer.py:197:stop] 0/1841, RunningAvgSamplesPerSec=11.98628126000581, CurrSamplesPerSec=11.89142279180107, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:49:01,624] [INFO] [timer.py:197:stop] 0/1842, RunningAvgSamplesPerSec=11.986295494785649, CurrSamplesPerSec=12.012530582821208, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:49:08,170] [INFO] [timer.py:197:stop] 0/1843, RunningAvgSamplesPerSec=11.9862523890489, CurrSamplesPerSec=11.907459499358351, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:49:14,823] [INFO] [timer.py:197:stop] 0/1844, 
RunningAvgSamplesPerSec=11.98601201685424, CurrSamplesPerSec=11.559251424605828, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:49:21,297] [INFO] [timer.py:197:stop] 0/1845, RunningAvgSamplesPerSec=11.986001107792998, CurrSamplesPerSec=11.9659402671761, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:49:27,652] [INFO] [timer.py:197:stop] 0/1846, RunningAvgSamplesPerSec=11.986009464973048, CurrSamplesPerSec=12.001431576308136, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:49:34,101] [INFO] [timer.py:197:stop] 0/1847, RunningAvgSamplesPerSec=11.986029094218226, CurrSamplesPerSec=12.022335121501582, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:49:40,519] [INFO] [timer.py:197:stop] 0/1848, RunningAvgSamplesPerSec=11.986047392105451, CurrSamplesPerSec=12.019902400895695, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:49:46,962] [INFO] [timer.py:197:stop] 0/1849, RunningAvgSamplesPerSec=11.986070055546046, CurrSamplesPerSec=12.028053386749237, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:49:53,568] [INFO] [logging.py:68:log_dist] [Rank 0] step=1850, skipped=4, lr=[7.011111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 21:49:53,569] [INFO] [timer.py:197:stop] 0/1850, RunningAvgSamplesPerSec=11.985956325450656, CurrSamplesPerSec=11.77951674815942, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 7.011111111111112e-06, 'epoch': 48.68} [2022-12-19 21:50:00,140] [INFO] [timer.py:197:stop] 0/1851, RunningAvgSamplesPerSec=11.985895888797849, CurrSamplesPerSec=11.875240619920703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:50:06,727] [INFO] [timer.py:197:stop] 0/1852, RunningAvgSamplesPerSec=11.985827079091766, CurrSamplesPerSec=11.859934995463643, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:50:13,215] [INFO] [timer.py:197:stop] 0/1853, RunningAvgSamplesPerSec=11.985802028198082, CurrSamplesPerSec=11.939636474525587, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 21:50:19,733] [INFO] [timer.py:197:stop] 0/1854, RunningAvgSamplesPerSec=11.985809727132072, CurrSamplesPerSec=12.000077426933137, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:50:26,175] [INFO] [timer.py:197:stop] 0/1855, RunningAvgSamplesPerSec=11.985821865232866, CurrSamplesPerSec=12.008343891436722, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:50:32,677] [INFO] [timer.py:197:stop] 0/1856, RunningAvgSamplesPerSec=11.985756998732974, CurrSamplesPerSec=11.866753430471071, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:50:39,194] [INFO] [timer.py:197:stop] 0/1857, RunningAvgSamplesPerSec=11.985681771666467, CurrSamplesPerSec=11.847815926913752, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:50:45,698] [INFO] [timer.py:197:stop] 0/1858, RunningAvgSamplesPerSec=11.985602984319296, CurrSamplesPerSec=11.841214059946124, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:50:52,139] [INFO] [timer.py:197:stop] 0/1859, RunningAvgSamplesPerSec=11.985571212990273, CurrSamplesPerSec=11.926892474294032, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:50:58,617] [INFO] [logging.py:68:log_dist] [Rank 0] step=1860, skipped=4, lr=[6.9888888888888895e-06], mom=[[0.9, 0.999]] [2022-12-19 21:50:58,618] [INFO] [timer.py:197:stop] 0/1860, RunningAvgSamplesPerSec=11.985540086641096, CurrSamplesPerSec=11.928016021955315, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:51:05,164] [INFO] [timer.py:197:stop] 0/1861, RunningAvgSamplesPerSec=11.9854732952232, CurrSamplesPerSec=11.862647274713494, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:51:09,793] [INFO] [timer.py:197:stop] 0/1862, RunningAvgSamplesPerSec=11.98727401788469, CurrSamplesPerSec=16.63282231130994, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:51:16,229] [INFO] [timer.py:197:stop] 0/1863, RunningAvgSamplesPerSec=11.987269900978916, CurrSamplesPerSec=11.979617347300804, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:51:22,733] [INFO] [timer.py:197:stop] 0/1864, RunningAvgSamplesPerSec=11.98726702958665, CurrSamplesPerSec=11.981925750886083, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:51:29,259] [INFO] [timer.py:197:stop] 0/1865, RunningAvgSamplesPerSec=11.987230170593216, CurrSamplesPerSec=11.918989637620882, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:51:35,650] [INFO] [timer.py:197:stop] 0/1866, RunningAvgSamplesPerSec=11.987242510487286, CurrSamplesPerSec=12.010275930514608, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:51:42,091] [INFO] [timer.py:197:stop] 0/1867, RunningAvgSamplesPerSec=11.987246736740195, CurrSamplesPerSec=11.995129655408105, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:51:48,695] [INFO] [timer.py:197:stop] 0/1868, RunningAvgSamplesPerSec=11.987174661986126, CurrSamplesPerSec=11.854246646870394, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:51:55,149] [INFO] [timer.py:197:stop] 0/1869, RunningAvgSamplesPerSec=11.987174957897189, CurrSamplesPerSec=11.98772715338955, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:52:01,544] [INFO] [logging.py:68:log_dist] [Rank 0] step=1870, skipped=4, lr=[6.966666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 21:52:01,545] [INFO] [timer.py:197:stop] 0/1870, RunningAvgSamplesPerSec=11.987166893369064, CurrSamplesPerSec=11.97212931741964, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:52:08,013] [INFO] [timer.py:197:stop] 0/1871, RunningAvgSamplesPerSec=11.987107270443659, CurrSamplesPerSec=11.876757485903966, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:52:14,557] [INFO] [timer.py:197:stop] 0/1872, RunningAvgSamplesPerSec=11.987062425061186, CurrSamplesPerSec=11.903828704003034, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:52:21,038] [INFO] [timer.py:197:stop] 0/1873, RunningAvgSamplesPerSec=11.98700342626591, 
CurrSamplesPerSec=11.877682402104142, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:52:27,585] [INFO] [timer.py:197:stop] 0/1874, RunningAvgSamplesPerSec=11.986963505391959, CurrSamplesPerSec=11.91273432679397, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:52:34,070] [INFO] [timer.py:197:stop] 0/1875, RunningAvgSamplesPerSec=11.986936211634479, CurrSamplesPerSec=11.936059274716051, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.955555555555557e-06, 'epoch': 49.34} [2022-12-19 21:52:40,629] [INFO] [timer.py:197:stop] 0/1876, RunningAvgSamplesPerSec=11.986835706713933, CurrSamplesPerSec=11.8015020923357, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:52:47,055] [INFO] [timer.py:197:stop] 0/1877, RunningAvgSamplesPerSec=11.986851578941986, CurrSamplesPerSec=12.016670166589304, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:52:53,552] [INFO] [timer.py:197:stop] 0/1878, RunningAvgSamplesPerSec=11.986773312549758, CurrSamplesPerSec=11.841799631099905, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:53:00,118] [INFO] [timer.py:197:stop] 0/1879, RunningAvgSamplesPerSec=11.986670974375485, CurrSamplesPerSec=11.797712653590633, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:53:06,621] [INFO] [logging.py:68:log_dist] [Rank 0] step=1880, skipped=4, lr=[6.944444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 21:53:06,622] [INFO] [timer.py:197:stop] 0/1880, RunningAvgSamplesPerSec=11.9866732946702, CurrSamplesPerSec=11.991030071666424, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:53:13,124] [INFO] [timer.py:197:stop] 0/1881, RunningAvgSamplesPerSec=11.986574101931387, CurrSamplesPerSec=11.803142378232744, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:53:20,236] [INFO] [timer.py:197:stop] 0/1882, RunningAvgSamplesPerSec=11.98646914872177, CurrSamplesPerSec=11.792455766889198, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
21:53:27,163] [INFO] [timer.py:197:stop] 0/1883, RunningAvgSamplesPerSec=11.986428143132045, CurrSamplesPerSec=11.909830532718825, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:53:33,872] [INFO] [timer.py:197:stop] 0/1884, RunningAvgSamplesPerSec=11.986313585797172, CurrSamplesPerSec=11.774638620292883, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:53:40,737] [INFO] [timer.py:197:stop] 0/1885, RunningAvgSamplesPerSec=11.986289189699676, CurrSamplesPerSec=11.940551027236607, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:53:47,677] [INFO] [timer.py:197:stop] 0/1886, RunningAvgSamplesPerSec=11.986248979373922, CurrSamplesPerSec=11.911008475799509, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:53:54,212] [INFO] [timer.py:197:stop] 0/1887, RunningAvgSamplesPerSec=11.986217951719146, CurrSamplesPerSec=11.928045703442454, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:54:00,735] [INFO] [timer.py:197:stop] 0/1888, RunningAvgSamplesPerSec=11.98620414537362, CurrSamplesPerSec=11.960235597995615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:54:07,249] [INFO] [timer.py:197:stop] 0/1889, RunningAvgSamplesPerSec=11.986195145889676, CurrSamplesPerSec=11.969246132513444, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:54:13,812] [INFO] [logging.py:68:log_dist] [Rank 0] step=1890, skipped=4, lr=[6.922222222222222e-06], mom=[[0.9, 0.999]] [2022-12-19 21:54:13,813] [INFO] [timer.py:197:stop] 0/1890, RunningAvgSamplesPerSec=11.986128983141942, CurrSamplesPerSec=11.862567592016221, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:54:20,493] [INFO] [timer.py:197:stop] 0/1891, RunningAvgSamplesPerSec=11.98607615380443, CurrSamplesPerSec=11.887157945601825, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:54:27,170] [INFO] [timer.py:197:stop] 0/1892, RunningAvgSamplesPerSec=11.986000335303048, CurrSamplesPerSec=11.844471217963978, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 21:54:33,842] [INFO] [timer.py:197:stop] 0/1893, RunningAvgSamplesPerSec=11.98594267055665, CurrSamplesPerSec=11.877938881651264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:54:40,436] [INFO] [timer.py:197:stop] 0/1894, RunningAvgSamplesPerSec=11.985939092515382, CurrSamplesPerSec=11.979176835796192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:54:47,039] [INFO] [timer.py:197:stop] 0/1895, RunningAvgSamplesPerSec=11.9859016614642, CurrSamplesPerSec=11.915498315648954, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:54:53,484] [INFO] [timer.py:197:stop] 0/1896, RunningAvgSamplesPerSec=11.985919889740202, CurrSamplesPerSec=12.02052569509226, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:54:59,959] [INFO] [timer.py:197:stop] 0/1897, RunningAvgSamplesPerSec=11.985911720327682, CurrSamplesPerSec=11.9704588120327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:55:06,460] [INFO] [timer.py:197:stop] 0/1898, RunningAvgSamplesPerSec=11.985859818436099, CurrSamplesPerSec=11.888306662210175, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:55:12,936] [INFO] [timer.py:197:stop] 0/1899, RunningAvgSamplesPerSec=11.985815951928437, CurrSamplesPerSec=11.90321850846121, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:55:17,720] [INFO] [logging.py:68:log_dist] [Rank 0] step=1900, skipped=4, lr=[6.9e-06], mom=[[0.9, 0.999]] [2022-12-19 21:55:17,720] [INFO] [timer.py:197:stop] 0/1900, RunningAvgSamplesPerSec=11.987533872108784, CurrSamplesPerSec=16.46403910612812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.9e-06, 'epoch': 50.0} [2022-12-19 21:55:24,260] [INFO] [timer.py:197:stop] 0/1901, RunningAvgSamplesPerSec=11.987459444820638, CurrSamplesPerSec=11.847842595983758, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:55:30,748] [INFO] [timer.py:197:stop] 0/1902, RunningAvgSamplesPerSec=11.987394174050669, 
CurrSamplesPerSec=11.864714156500192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:55:37,165] [INFO] [timer.py:197:stop] 0/1903, RunningAvgSamplesPerSec=11.987402109453292, CurrSamplesPerSec=12.002498371895507, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:55:43,726] [INFO] [timer.py:197:stop] 0/1904, RunningAvgSamplesPerSec=11.987398888884577, CurrSamplesPerSec=11.981279714636829, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:55:50,276] [INFO] [timer.py:197:stop] 0/1905, RunningAvgSamplesPerSec=11.987401660318687, CurrSamplesPerSec=11.992675248189675, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:55:56,806] [INFO] [timer.py:197:stop] 0/1906, RunningAvgSamplesPerSec=11.987410512445404, CurrSamplesPerSec=12.004279828059168, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:56:03,272] [INFO] [timer.py:197:stop] 0/1907, RunningAvgSamplesPerSec=11.98741754392326, CurrSamplesPerSec=12.000820454422309, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:56:09,804] [INFO] [timer.py:197:stop] 0/1908, RunningAvgSamplesPerSec=11.987325105019337, CurrSamplesPerSec=11.813779746031123, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:56:16,270] [INFO] [timer.py:197:stop] 0/1909, RunningAvgSamplesPerSec=11.987282705435517, CurrSamplesPerSec=11.90701054633717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:56:22,825] [INFO] [logging.py:68:log_dist] [Rank 0] step=1910, skipped=4, lr=[6.8777777777777785e-06], mom=[[0.9, 0.999]] [2022-12-19 21:56:22,825] [INFO] [timer.py:197:stop] 0/1910, RunningAvgSamplesPerSec=11.987236466293094, CurrSamplesPerSec=11.899702657552155, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:56:29,417] [INFO] [timer.py:197:stop] 0/1911, RunningAvgSamplesPerSec=11.987192248333942, CurrSamplesPerSec=11.903414334296581, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:56:35,995] [INFO] [timer.py:197:stop] 0/1912, 
RunningAvgSamplesPerSec=11.987211480734748, CurrSamplesPerSec=12.024038988933388, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:56:42,507] [INFO] [timer.py:197:stop] 0/1913, RunningAvgSamplesPerSec=11.987175190958185, CurrSamplesPerSec=11.918260412448763, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:56:49,034] [INFO] [timer.py:197:stop] 0/1914, RunningAvgSamplesPerSec=11.987140371100782, CurrSamplesPerSec=11.92096714413219, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:56:55,423] [INFO] [timer.py:197:stop] 0/1915, RunningAvgSamplesPerSec=11.987158088149462, CurrSamplesPerSec=12.021129135596592, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:57:01,937] [INFO] [timer.py:197:stop] 0/1916, RunningAvgSamplesPerSec=11.987100430477474, CurrSamplesPerSec=11.877807487056936, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:57:08,416] [INFO] [timer.py:197:stop] 0/1917, RunningAvgSamplesPerSec=11.987108552552069, CurrSamplesPerSec=12.00267440066208, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:57:15,017] [INFO] [timer.py:197:stop] 0/1918, RunningAvgSamplesPerSec=11.98707809993964, CurrSamplesPerSec=11.929043829446092, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:57:21,627] [INFO] [timer.py:197:stop] 0/1919, RunningAvgSamplesPerSec=11.987049844963062, CurrSamplesPerSec=11.933156831292251, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:57:28,245] [INFO] [logging.py:68:log_dist] [Rank 0] step=1920, skipped=4, lr=[6.855555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 21:57:28,246] [INFO] [timer.py:197:stop] 0/1920, RunningAvgSamplesPerSec=11.987061003411286, CurrSamplesPerSec=12.00849000836815, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:57:34,819] [INFO] [timer.py:197:stop] 0/1921, RunningAvgSamplesPerSec=11.986995507966771, CurrSamplesPerSec=11.862678728704465, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:57:41,356] [INFO] 
[timer.py:197:stop] 0/1922, RunningAvgSamplesPerSec=11.986963869048562, CurrSamplesPerSec=11.92655492137333, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:57:47,840] [INFO] [timer.py:197:stop] 0/1923, RunningAvgSamplesPerSec=11.986893677093256, CurrSamplesPerSec=11.8536242504423, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:57:54,343] [INFO] [timer.py:197:stop] 0/1924, RunningAvgSamplesPerSec=11.986859376982563, CurrSamplesPerSec=11.921329264036865, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:58:00,794] [INFO] [timer.py:197:stop] 0/1925, RunningAvgSamplesPerSec=11.986869796222512, CurrSamplesPerSec=12.006929104780703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.844444444444445e-06, 'epoch': 50.66} [2022-12-19 21:58:07,283] [INFO] [timer.py:197:stop] 0/1926, RunningAvgSamplesPerSec=11.986847163537377, CurrSamplesPerSec=11.9434820448972, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:58:13,734] [INFO] [timer.py:197:stop] 0/1927, RunningAvgSamplesPerSec=11.98686287422769, CurrSamplesPerSec=12.01716669951104, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:58:20,166] [INFO] [timer.py:197:stop] 0/1928, RunningAvgSamplesPerSec=11.986824946351328, CurrSamplesPerSec=11.914256027688554, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:58:26,619] [INFO] [timer.py:197:stop] 0/1929, RunningAvgSamplesPerSec=11.986840134440738, CurrSamplesPerSec=12.016163992707098, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:58:33,166] [INFO] [logging.py:68:log_dist] [Rank 0] step=1930, skipped=4, lr=[6.833333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 21:58:33,167] [INFO] [timer.py:197:stop] 0/1930, RunningAvgSamplesPerSec=11.98681490593478, CurrSamplesPerSec=11.938396050851917, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:58:39,809] [INFO] [timer.py:197:stop] 0/1931, RunningAvgSamplesPerSec=11.98680177719143, 
CurrSamplesPerSec=11.961542926129717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:58:46,256] [INFO] [timer.py:197:stop] 0/1932, RunningAvgSamplesPerSec=11.986829928648799, CurrSamplesPerSec=12.041381353378359, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:58:52,710] [INFO] [timer.py:197:stop] 0/1933, RunningAvgSamplesPerSec=11.986811152853216, CurrSamplesPerSec=11.950683142422198, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:58:59,526] [INFO] [timer.py:197:stop] 0/1934, RunningAvgSamplesPerSec=11.986769662938798, CurrSamplesPerSec=11.907184841725, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:59:06,180] [INFO] [timer.py:197:stop] 0/1935, RunningAvgSamplesPerSec=11.986791018446693, CurrSamplesPerSec=12.028192438205705, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:59:12,701] [INFO] [timer.py:197:stop] 0/1936, RunningAvgSamplesPerSec=11.986751991755373, CurrSamplesPerSec=11.91178544329796, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:59:19,130] [INFO] [timer.py:197:stop] 0/1937, RunningAvgSamplesPerSec=11.986769340420151, CurrSamplesPerSec=12.020415887278796, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:59:23,785] [INFO] [timer.py:197:stop] 0/1938, RunningAvgSamplesPerSec=11.988489979082837, CurrSamplesPerSec=16.59902135598813, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:59:30,394] [INFO] [timer.py:197:stop] 0/1939, RunningAvgSamplesPerSec=11.988500843300645, CurrSamplesPerSec=12.009570954349346, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:59:36,821] [INFO] [logging.py:68:log_dist] [Rank 0] step=1940, skipped=4, lr=[6.811111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 21:59:36,821] [INFO] [timer.py:197:stop] 0/1940, RunningAvgSamplesPerSec=11.98848929954866, CurrSamplesPerSec=11.96617070082275, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:59:43,453] [INFO] [timer.py:197:stop] 0/1941, 
RunningAvgSamplesPerSec=11.988434344551093, CurrSamplesPerSec=11.882869856379692, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:59:49,941] [INFO] [timer.py:197:stop] 0/1942, RunningAvgSamplesPerSec=11.988407168801531, CurrSamplesPerSec=11.935944105184998, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 21:59:56,401] [INFO] [timer.py:197:stop] 0/1943, RunningAvgSamplesPerSec=11.988374350758207, CurrSamplesPerSec=11.92504385117734, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:00:02,873] [INFO] [timer.py:197:stop] 0/1944, RunningAvgSamplesPerSec=11.988352429642267, CurrSamplesPerSec=11.945954100663679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:00:09,334] [INFO] [timer.py:197:stop] 0/1945, RunningAvgSamplesPerSec=11.98836787695008, CurrSamplesPerSec=12.018441842006482, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:00:15,787] [INFO] [timer.py:197:stop] 0/1946, RunningAvgSamplesPerSec=11.988329397296653, CurrSamplesPerSec=11.91402706035402, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:00:22,326] [INFO] [timer.py:197:stop] 0/1947, RunningAvgSamplesPerSec=11.988337432990265, CurrSamplesPerSec=12.003979213857523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:00:28,968] [INFO] [timer.py:197:stop] 0/1948, RunningAvgSamplesPerSec=11.98831259586106, CurrSamplesPerSec=11.940198360814847, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:00:35,425] [INFO] [timer.py:197:stop] 0/1949, RunningAvgSamplesPerSec=11.98828228603181, CurrSamplesPerSec=11.929588284025177, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:00:41,946] [INFO] [logging.py:68:log_dist] [Rank 0] step=1950, skipped=4, lr=[6.788888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 22:00:41,947] [INFO] [timer.py:197:stop] 0/1950, RunningAvgSamplesPerSec=11.98824941681559, CurrSamplesPerSec=11.924593041497477, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 
6.788888888888889e-06, 'epoch': 51.32} [2022-12-19 22:00:48,462] [INFO] [timer.py:197:stop] 0/1951, RunningAvgSamplesPerSec=11.988212855764138, CurrSamplesPerSec=11.91741276065564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:00:54,926] [INFO] [timer.py:197:stop] 0/1952, RunningAvgSamplesPerSec=11.988187976661553, CurrSamplesPerSec=11.939894043433679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:01:01,387] [INFO] [timer.py:197:stop] 0/1953, RunningAvgSamplesPerSec=11.988153476107962, CurrSamplesPerSec=11.921253026439215, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:01:07,895] [INFO] [timer.py:197:stop] 0/1954, RunningAvgSamplesPerSec=11.988114197619641, CurrSamplesPerSec=11.911968864699682, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:01:14,315] [INFO] [timer.py:197:stop] 0/1955, RunningAvgSamplesPerSec=11.988135146555726, CurrSamplesPerSec=12.02916750558729, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:01:20,823] [INFO] [timer.py:197:stop] 0/1956, RunningAvgSamplesPerSec=11.988085205836132, CurrSamplesPerSec=11.891338507892609, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:01:27,294] [INFO] [timer.py:197:stop] 0/1957, RunningAvgSamplesPerSec=11.988048429842435, CurrSamplesPerSec=11.916616543432838, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:01:33,762] [INFO] [timer.py:197:stop] 0/1958, RunningAvgSamplesPerSec=11.988057427984296, CurrSamplesPerSec=12.005674660209147, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:01:40,209] [INFO] [timer.py:197:stop] 0/1959, RunningAvgSamplesPerSec=11.988065635355962, CurrSamplesPerSec=12.004140792119454, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:01:46,734] [INFO] [logging.py:68:log_dist] [Rank 0] step=1960, skipped=4, lr=[6.7666666666666665e-06], mom=[[0.9, 0.999]] [2022-12-19 22:01:46,734] [INFO] [timer.py:197:stop] 0/1960, RunningAvgSamplesPerSec=11.988006758236436, 
CurrSamplesPerSec=11.873881706817556, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:01:53,268] [INFO] [timer.py:197:stop] 0/1961, RunningAvgSamplesPerSec=11.988003799551887, CurrSamplesPerSec=11.982213494752248, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:01:59,721] [INFO] [timer.py:197:stop] 0/1962, RunningAvgSamplesPerSec=11.987964625155913, CurrSamplesPerSec=11.911710384871819, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:02:06,223] [INFO] [timer.py:197:stop] 0/1963, RunningAvgSamplesPerSec=11.987930991476285, CurrSamplesPerSec=11.922369687035163, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:02:12,657] [INFO] [timer.py:197:stop] 0/1964, RunningAvgSamplesPerSec=11.987941296067827, CurrSamplesPerSec=12.008182737180162, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:02:19,286] [INFO] [timer.py:197:stop] 0/1965, RunningAvgSamplesPerSec=11.987957555834235, CurrSamplesPerSec=12.019944382412291, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:02:25,758] [INFO] [timer.py:197:stop] 0/1966, RunningAvgSamplesPerSec=11.987956368605683, CurrSamplesPerSec=11.985626292169187, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:02:32,441] [INFO] [timer.py:197:stop] 0/1967, RunningAvgSamplesPerSec=11.987727687644899, CurrSamplesPerSec=11.554825549269204, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:02:38,983] [INFO] [timer.py:197:stop] 0/1968, RunningAvgSamplesPerSec=11.987693405232301, CurrSamplesPerSec=11.920705097395594, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:02:45,472] [INFO] [timer.py:197:stop] 0/1969, RunningAvgSamplesPerSec=11.987651900999229, CurrSamplesPerSec=11.906606517562475, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:02:52,030] [INFO] [logging.py:68:log_dist] [Rank 0] step=1970, skipped=4, lr=[6.744444444444444e-06], mom=[[0.9, 0.999]] [2022-12-19 22:02:52,030] [INFO] [timer.py:197:stop] 0/1970, 
RunningAvgSamplesPerSec=11.987596233839135, CurrSamplesPerSec=11.879090548267543, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:02:58,501] [INFO] [timer.py:197:stop] 0/1971, RunningAvgSamplesPerSec=11.987563865846457, CurrSamplesPerSec=11.924200531098775, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:03:05,176] [INFO] [timer.py:197:stop] 0/1972, RunningAvgSamplesPerSec=11.987533495569057, CurrSamplesPerSec=11.928031392706998, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:03:11,664] [INFO] [timer.py:197:stop] 0/1973, RunningAvgSamplesPerSec=11.987510864025095, CurrSamplesPerSec=11.94309200959183, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:03:18,163] [INFO] [timer.py:197:stop] 0/1974, RunningAvgSamplesPerSec=11.987435389274724, CurrSamplesPerSec=11.840499020817735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:03:24,861] [INFO] [timer.py:197:stop] 0/1975, RunningAvgSamplesPerSec=11.98743234585665, CurrSamplesPerSec=11.98143373019938, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.733333333333334e-06, 'epoch': 51.97} [2022-12-19 22:03:29,499] [INFO] [timer.py:197:stop] 0/1976, RunningAvgSamplesPerSec=11.989111833948037, CurrSamplesPerSec=16.569280845608095, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:03:35,978] [INFO] [timer.py:197:stop] 0/1977, RunningAvgSamplesPerSec=11.989088941354254, CurrSamplesPerSec=11.944068739982201, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:03:42,679] [INFO] [timer.py:197:stop] 0/1978, RunningAvgSamplesPerSec=11.989021227019279, CurrSamplesPerSec=11.856761500721095, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:03:49,220] [INFO] [timer.py:197:stop] 0/1979, RunningAvgSamplesPerSec=11.988937525916304, CurrSamplesPerSec=11.8257959055286, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:03:55,720] [INFO] [logging.py:68:log_dist] [Rank 0] step=1980, skipped=4, 
lr=[6.7222222222222235e-06], mom=[[0.9, 0.999]] [2022-12-19 22:03:55,720] [INFO] [timer.py:197:stop] 0/1980, RunningAvgSamplesPerSec=11.988868520881296, CurrSamplesPerSec=11.853981243596593, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:04:02,330] [INFO] [timer.py:197:stop] 0/1981, RunningAvgSamplesPerSec=11.988800971921444, CurrSamplesPerSec=11.856662520033082, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:04:08,833] [INFO] [timer.py:197:stop] 0/1982, RunningAvgSamplesPerSec=11.988724515311187, CurrSamplesPerSec=11.839303651567858, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:04:15,357] [INFO] [timer.py:197:stop] 0/1983, RunningAvgSamplesPerSec=11.988616782418385, CurrSamplesPerSec=11.779036558195367, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:04:21,872] [INFO] [timer.py:197:stop] 0/1984, RunningAvgSamplesPerSec=11.988616524252235, CurrSamplesPerSec=11.988105118936977, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:04:28,476] [INFO] [timer.py:197:stop] 0/1985, RunningAvgSamplesPerSec=11.988561486973207, CurrSamplesPerSec=11.880461695904867, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:04:34,925] [INFO] [timer.py:197:stop] 0/1986, RunningAvgSamplesPerSec=11.988554584709144, CurrSamplesPerSec=11.974883011611826, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:04:41,392] [INFO] [timer.py:197:stop] 0/1987, RunningAvgSamplesPerSec=11.98855879024554, CurrSamplesPerSec=11.996908388530146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:04:48,130] [INFO] [timer.py:197:stop] 0/1988, RunningAvgSamplesPerSec=11.9885715577325, CurrSamplesPerSec=12.013968735225083, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:04:54,584] [INFO] [timer.py:197:stop] 0/1989, RunningAvgSamplesPerSec=11.988544948659529, CurrSamplesPerSec=11.935931367677508, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:05:01,287] [INFO] [logging.py:68:log_dist] [Rank 0] 
step=1990, skipped=4, lr=[6.700000000000001e-06], mom=[[0.9, 0.999]] [2022-12-19 22:05:01,287] [INFO] [timer.py:197:stop] 0/1990, RunningAvgSamplesPerSec=11.988506605383792, CurrSamplesPerSec=11.912799881988326, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:05:07,754] [INFO] [timer.py:197:stop] 0/1991, RunningAvgSamplesPerSec=11.988478816136707, CurrSamplesPerSec=11.933487330857714, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:05:14,159] [INFO] [timer.py:197:stop] 0/1992, RunningAvgSamplesPerSec=11.988495407485823, CurrSamplesPerSec=12.021586735689406, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:05:20,588] [INFO] [timer.py:197:stop] 0/1993, RunningAvgSamplesPerSec=11.988463418828553, CurrSamplesPerSec=11.92514238756196, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:05:27,091] [INFO] [timer.py:197:stop] 0/1994, RunningAvgSamplesPerSec=11.988421329494853, CurrSamplesPerSec=11.905203456736377, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:05:33,526] [INFO] [timer.py:197:stop] 0/1995, RunningAvgSamplesPerSec=11.988429358561119, CurrSamplesPerSec=12.004444635448708, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:05:40,200] [INFO] [timer.py:197:stop] 0/1996, RunningAvgSamplesPerSec=11.988382979211183, CurrSamplesPerSec=11.896656528219074, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:05:46,638] [INFO] [timer.py:197:stop] 0/1997, RunningAvgSamplesPerSec=11.988365764737264, CurrSamplesPerSec=11.95413815501828, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:05:53,105] [INFO] [timer.py:197:stop] 0/1998, RunningAvgSamplesPerSec=11.988337471689794, CurrSamplesPerSec=11.932157486519865, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:05:59,793] [INFO] [timer.py:197:stop] 0/1999, RunningAvgSamplesPerSec=11.988299873193036, CurrSamplesPerSec=11.913720374249117, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:06:06,164] [INFO] 
[logging.py:68:log_dist] [Rank 0] step=2000, skipped=4, lr=[6.677777777777779e-06], mom=[[0.9, 0.999]]
[2022-12-19 22:06:06,164] [INFO] [timer.py:197:stop] 0/2000, RunningAvgSamplesPerSec=11.988322962062853, CurrSamplesPerSec=12.03460954777992, MemAllocated=1.52GB, MaxMemAllocated=26.06GB
{'loss': 0.0001, 'learning_rate': 6.677777777777779e-06, 'epoch': 52.63}
{'eval_loss': 0.411376953125, 'eval_wer': 18.17784256559767, 'eval_runtime': 167.3664, 'eval_samples_per_second': 7.212, 'eval_steps_per_second': 0.227, 'epoch': 52.63}
[2022-12-19 22:08:55,380] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step2000 is begin to save!
[2022-12-19 22:08:55,387] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: ./checkpoint-2000/global_step2000/mp_rank_00_model_states.pt
[2022-12-19 22:08:55,388] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving ./checkpoint-2000/global_step2000/mp_rank_00_model_states.pt...
[2022-12-19 22:08:57,172] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved ./checkpoint-2000/global_step2000/mp_rank_00_model_states.pt.
[2022-12-19 22:08:57,173] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving ./checkpoint-2000/global_step2000/zero_pp_rank_0_mp_rank_00_optim_states.pt...
[2022-12-19 22:09:04,575] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved ./checkpoint-2000/global_step2000/zero_pp_rank_0_mp_rank_00_optim_states.pt.
[2022-12-19 22:09:04,575] [INFO] [engine.py:3269:_save_zero_checkpoint] zero checkpoint saved ./checkpoint-2000/global_step2000/zero_pp_rank_0_mp_rank_00_optim_states.pt
[2022-12-19 22:09:04,575] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step2000 is ready now!
[2022-12-19 22:10:10,663] [INFO] [timer.py:197:stop] 0/2001, RunningAvgSamplesPerSec=11.988139112510046, CurrSamplesPerSec=11.63173386918905, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:10:17,193] [INFO] [timer.py:197:stop] 0/2002, RunningAvgSamplesPerSec=11.988101521727597, CurrSamplesPerSec=11.913425864696626, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:10:23,728] [INFO] [timer.py:197:stop] 0/2003, RunningAvgSamplesPerSec=11.98805561679869, CurrSamplesPerSec=11.896943882029106, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:10:30,240] [INFO] [timer.py:197:stop] 0/2004, RunningAvgSamplesPerSec=11.988051391691087, CurrSamplesPerSec=11.979602912551211, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:10:36,737] [INFO] [timer.py:197:stop] 0/2005, RunningAvgSamplesPerSec=11.988045378386222, CurrSamplesPerSec=11.97601882538536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:10:43,317] [INFO] [timer.py:197:stop] 0/2006, RunningAvgSamplesPerSec=11.988030344824711, CurrSamplesPerSec=11.957993606793096, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:10:49,971] [INFO] [timer.py:197:stop] 0/2007, RunningAvgSamplesPerSec=11.987984553367719, CurrSamplesPerSec=11.896915936920815, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:10:56,548] [INFO] [timer.py:197:stop] 0/2008, RunningAvgSamplesPerSec=11.987945332900223, CurrSamplesPerSec=11.909821021340788, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:11:03,062] [INFO] [timer.py:197:stop] 0/2009, RunningAvgSamplesPerSec=11.987951402091154, CurrSamplesPerSec=12.000138582399508, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:11:09,477] [INFO] [logging.py:68:log_dist] [Rank 0] step=2010, skipped=4, lr=[6.655555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 22:11:09,478] [INFO] [timer.py:197:stop] 0/2010, RunningAvgSamplesPerSec=11.987911785502583, CurrSamplesPerSec=11.908925434641656, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 22:11:16,036] [INFO] [timer.py:197:stop] 0/2011, RunningAvgSamplesPerSec=11.987876981537273, CurrSamplesPerSec=11.918395878591207, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:11:22,544] [INFO] [timer.py:197:stop] 0/2012, RunningAvgSamplesPerSec=11.987879628800627, CurrSamplesPerSec=11.993200342556333, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:11:28,969] [INFO] [timer.py:197:stop] 0/2013, RunningAvgSamplesPerSec=11.987883277862725, CurrSamplesPerSec=11.995222385240568, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:11:33,667] [INFO] [timer.py:197:stop] 0/2014, RunningAvgSamplesPerSec=11.989519362048608, CurrSamplesPerSec=16.524904463394137, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:11:40,198] [INFO] [timer.py:197:stop] 0/2015, RunningAvgSamplesPerSec=11.9894483117972, CurrSamplesPerSec=11.84818041459623, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:11:46,728] [INFO] [timer.py:197:stop] 0/2016, RunningAvgSamplesPerSec=11.989363996163904, CurrSamplesPerSec=11.822006991855167, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:11:53,235] [INFO] [timer.py:197:stop] 0/2017, RunningAvgSamplesPerSec=11.989271826695015, CurrSamplesPerSec=11.806474168854175, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:11:59,727] [INFO] [timer.py:197:stop] 0/2018, RunningAvgSamplesPerSec=11.989231273285258, CurrSamplesPerSec=11.908069601547135, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:12:06,240] [INFO] [timer.py:197:stop] 0/2019, RunningAvgSamplesPerSec=11.989136579851259, CurrSamplesPerSec=11.801228150388633, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:12:12,774] [INFO] [logging.py:68:log_dist] [Rank 0] step=2020, skipped=4, lr=[6.633333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 22:12:12,774] [INFO] [timer.py:197:stop] 0/2020, RunningAvgSamplesPerSec=11.989059959582729, CurrSamplesPerSec=11.836484601572572, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:12:19,506] [INFO] [timer.py:197:stop] 0/2021, RunningAvgSamplesPerSec=11.98903130334428, CurrSamplesPerSec=11.93148074298238, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:12:25,960] [INFO] [timer.py:197:stop] 0/2022, RunningAvgSamplesPerSec=11.98897267805674, CurrSamplesPerSec=11.871765951540315, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:12:32,378] [INFO] [timer.py:197:stop] 0/2023, RunningAvgSamplesPerSec=11.988977982987139, CurrSamplesPerSec=11.999703533821418, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:12:38,842] [INFO] [timer.py:197:stop] 0/2024, RunningAvgSamplesPerSec=11.988924187495932, CurrSamplesPerSec=11.881181044087745, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:12:45,363] [INFO] [timer.py:197:stop] 0/2025, RunningAvgSamplesPerSec=11.98886200900845, CurrSamplesPerSec=11.864442516015103, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.6222222222222236e-06, 'epoch': 53.29} [2022-12-19 22:12:51,899] [INFO] [timer.py:197:stop] 0/2026, RunningAvgSamplesPerSec=11.988810772859825, CurrSamplesPerSec=11.8860489234938, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:12:58,483] [INFO] [timer.py:197:stop] 0/2027, RunningAvgSamplesPerSec=11.988786281939085, CurrSamplesPerSec=11.939420868685408, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:13:05,071] [INFO] [timer.py:197:stop] 0/2028, RunningAvgSamplesPerSec=11.988778235224558, CurrSamplesPerSec=11.972505766024982, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:13:11,728] [INFO] [timer.py:197:stop] 0/2029, RunningAvgSamplesPerSec=11.988714039896328, CurrSamplesPerSec=11.860050798848295, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:13:18,286] [INFO] [logging.py:68:log_dist] [Rank 0] step=2030, skipped=4, lr=[6.6111111111111115e-06], mom=[[0.9, 0.999]] [2022-12-19 22:13:18,287] [INFO] [timer.py:197:stop] 
0/2030, RunningAvgSamplesPerSec=11.988671883045024, CurrSamplesPerSec=11.903825008859062, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:13:24,778] [INFO] [timer.py:197:stop] 0/2031, RunningAvgSamplesPerSec=11.988618966651623, CurrSamplesPerSec=11.882257073911466, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:13:31,334] [INFO] [timer.py:197:stop] 0/2032, RunningAvgSamplesPerSec=11.988547756443191, CurrSamplesPerSec=11.84578367866616, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:13:37,913] [INFO] [timer.py:197:stop] 0/2033, RunningAvgSamplesPerSec=11.988497660171365, CurrSamplesPerSec=11.88765804699442, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:13:44,505] [INFO] [timer.py:197:stop] 0/2034, RunningAvgSamplesPerSec=11.988469469541718, CurrSamplesPerSec=11.931486576650364, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:13:51,145] [INFO] [timer.py:197:stop] 0/2035, RunningAvgSamplesPerSec=11.988439683432755, CurrSamplesPerSec=11.928218495032736, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:13:57,759] [INFO] [timer.py:197:stop] 0/2036, RunningAvgSamplesPerSec=11.988408890430634, CurrSamplesPerSec=11.926132079839084, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:14:04,289] [INFO] [timer.py:197:stop] 0/2037, RunningAvgSamplesPerSec=11.988376654767638, CurrSamplesPerSec=11.923166143675695, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:14:10,737] [INFO] [timer.py:197:stop] 0/2038, RunningAvgSamplesPerSec=11.988400657185625, CurrSamplesPerSec=12.03744550176715, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:14:17,207] [INFO] [timer.py:197:stop] 0/2039, RunningAvgSamplesPerSec=11.988410188824945, CurrSamplesPerSec=12.007848087325069, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:14:23,676] [INFO] [logging.py:68:log_dist] [Rank 0] step=2040, skipped=4, lr=[6.588888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 22:14:23,677] [INFO] 
[timer.py:197:stop] 0/2040, RunningAvgSamplesPerSec=11.988408091387418, CurrSamplesPerSec=11.984137133993574, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:14:30,235] [INFO] [timer.py:197:stop] 0/2041, RunningAvgSamplesPerSec=11.988321704204305, CurrSamplesPerSec=11.814813960624718, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:14:36,817] [INFO] [timer.py:197:stop] 0/2042, RunningAvgSamplesPerSec=11.988337221444818, CurrSamplesPerSec=12.020060640450275, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:14:43,407] [INFO] [timer.py:197:stop] 0/2043, RunningAvgSamplesPerSec=11.988301711121688, CurrSamplesPerSec=11.916295970395733, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:14:49,921] [INFO] [timer.py:197:stop] 0/2044, RunningAvgSamplesPerSec=11.988287586532733, CurrSamplesPerSec=11.959528491483816, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:14:56,389] [INFO] [timer.py:197:stop] 0/2045, RunningAvgSamplesPerSec=11.988285460987036, CurrSamplesPerSec=11.98394666830241, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:15:02,847] [INFO] [timer.py:197:stop] 0/2046, RunningAvgSamplesPerSec=11.98828293964291, CurrSamplesPerSec=11.983134047042142, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:15:09,458] [INFO] [timer.py:197:stop] 0/2047, RunningAvgSamplesPerSec=11.988234910834906, CurrSamplesPerSec=11.89086179952573, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:15:16,116] [INFO] [timer.py:197:stop] 0/2048, RunningAvgSamplesPerSec=11.988227244125316, CurrSamplesPerSec=11.97256931074896, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:15:22,559] [INFO] [timer.py:197:stop] 0/2049, RunningAvgSamplesPerSec=11.988222427683738, CurrSamplesPerSec=11.978376085965893, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:15:29,020] [INFO] [logging.py:68:log_dist] [Rank 0] step=2050, skipped=4, lr=[6.566666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 
22:15:29,020] [INFO] [timer.py:197:stop] 0/2050, RunningAvgSamplesPerSec=11.988224589080263, CurrSamplesPerSec=11.992650602032851, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.566666666666667e-06, 'epoch': 53.95} [2022-12-19 22:15:35,513] [INFO] [timer.py:197:stop] 0/2051, RunningAvgSamplesPerSec=11.98822896152143, CurrSamplesPerSec=11.997190418174684, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:15:40,196] [INFO] [timer.py:197:stop] 0/2052, RunningAvgSamplesPerSec=11.98986101337474, CurrSamplesPerSec=16.628253866270747, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:15:46,677] [INFO] [timer.py:197:stop] 0/2053, RunningAvgSamplesPerSec=11.989806219461254, CurrSamplesPerSec=11.878521783123567, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:15:53,140] [INFO] [timer.py:197:stop] 0/2054, RunningAvgSamplesPerSec=11.989759642836127, CurrSamplesPerSec=11.894986460162, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:15:59,779] [INFO] [timer.py:197:stop] 0/2055, RunningAvgSamplesPerSec=11.989762768623944, CurrSamplesPerSec=11.996180320068014, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:16:06,349] [INFO] [timer.py:197:stop] 0/2056, RunningAvgSamplesPerSec=11.989755738001811, CurrSamplesPerSec=11.975339234528034, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:16:12,905] [INFO] [timer.py:197:stop] 0/2057, RunningAvgSamplesPerSec=11.989665762010146, CurrSamplesPerSec=11.80766187886112, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:16:19,384] [INFO] [timer.py:197:stop] 0/2058, RunningAvgSamplesPerSec=11.989647415681517, CurrSamplesPerSec=11.95206394987849, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:16:25,888] [INFO] [timer.py:197:stop] 0/2059, RunningAvgSamplesPerSec=11.989614401606376, CurrSamplesPerSec=11.922119757153093, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:16:32,416] [INFO] [logging.py:68:log_dist] [Rank 
0] step=2060, skipped=4, lr=[6.544444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 22:16:32,417] [INFO] [timer.py:197:stop] 0/2060, RunningAvgSamplesPerSec=11.989573245327236, CurrSamplesPerSec=11.905508648449091, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:16:38,868] [INFO] [timer.py:197:stop] 0/2061, RunningAvgSamplesPerSec=11.98958528615156, CurrSamplesPerSec=12.014416648886874, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:16:45,416] [INFO] [timer.py:197:stop] 0/2062, RunningAvgSamplesPerSec=11.989553039278219, CurrSamplesPerSec=11.92352257130293, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:16:51,951] [INFO] [timer.py:197:stop] 0/2063, RunningAvgSamplesPerSec=11.98946527915884, CurrSamplesPerSec=11.811366243209141, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:16:58,470] [INFO] [timer.py:197:stop] 0/2064, RunningAvgSamplesPerSec=11.989433464947814, CurrSamplesPerSec=11.924221188896604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:17:04,936] [INFO] [timer.py:197:stop] 0/2065, RunningAvgSamplesPerSec=11.989388946437094, CurrSamplesPerSec=11.898289620725182, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:17:11,492] [INFO] [timer.py:197:stop] 0/2066, RunningAvgSamplesPerSec=11.98935998997441, CurrSamplesPerSec=11.929919116143532, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:17:17,936] [INFO] [timer.py:197:stop] 0/2067, RunningAvgSamplesPerSec=11.98936135563249, CurrSamplesPerSec=11.99218073707056, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:17:24,443] [INFO] [timer.py:197:stop] 0/2068, RunningAvgSamplesPerSec=11.989331070346589, CurrSamplesPerSec=11.927116636947602, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:17:30,947] [INFO] [timer.py:197:stop] 0/2069, RunningAvgSamplesPerSec=11.989281491210551, CurrSamplesPerSec=11.887719114889327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:17:37,376] [INFO] 
[logging.py:68:log_dist] [Rank 0] step=2070, skipped=4, lr=[6.522222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 22:17:37,376] [INFO] [timer.py:197:stop] 0/2070, RunningAvgSamplesPerSec=11.989280115458481, CurrSamplesPerSec=11.986437110573181, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:17:43,871] [INFO] [timer.py:197:stop] 0/2071, RunningAvgSamplesPerSec=11.989258897892332, CurrSamplesPerSec=11.945541045736702, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:17:50,349] [INFO] [timer.py:197:stop] 0/2072, RunningAvgSamplesPerSec=11.989239648289724, CurrSamplesPerSec=11.949544149937177, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:17:56,819] [INFO] [timer.py:197:stop] 0/2073, RunningAvgSamplesPerSec=11.989252668949392, CurrSamplesPerSec=12.01626619228679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:18:03,279] [INFO] [timer.py:197:stop] 0/2074, RunningAvgSamplesPerSec=11.989208895000196, CurrSamplesPerSec=11.89923371786778, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:18:09,877] [INFO] [timer.py:197:stop] 0/2075, RunningAvgSamplesPerSec=11.989128382909177, CurrSamplesPerSec=11.82459777340165, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.511111111111112e-06, 'epoch': 54.61} [2022-12-19 22:18:16,379] [INFO] [timer.py:197:stop] 0/2076, RunningAvgSamplesPerSec=11.989090372096955, CurrSamplesPerSec=11.910808700539567, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:18:22,842] [INFO] [timer.py:197:stop] 0/2077, RunningAvgSamplesPerSec=11.989056583552621, CurrSamplesPerSec=11.91938656835922, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:18:29,322] [INFO] [timer.py:197:stop] 0/2078, RunningAvgSamplesPerSec=11.989070902574397, CurrSamplesPerSec=12.018856725182202, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:18:35,742] [INFO] [timer.py:197:stop] 0/2079, RunningAvgSamplesPerSec=11.989039019468013, 
CurrSamplesPerSec=11.923213277755876, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:18:42,189] [INFO] [logging.py:68:log_dist] [Rank 0] step=2080, skipped=4, lr=[6.5000000000000004e-06], mom=[[0.9, 0.999]] [2022-12-19 22:18:42,190] [INFO] [timer.py:197:stop] 0/2080, RunningAvgSamplesPerSec=11.98905254512671, CurrSamplesPerSec=12.017211351827994, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:18:48,705] [INFO] [timer.py:197:stop] 0/2081, RunningAvgSamplesPerSec=11.989014440931365, CurrSamplesPerSec=11.910353681907422, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:18:55,163] [INFO] [timer.py:197:stop] 0/2082, RunningAvgSamplesPerSec=11.988988622200294, CurrSamplesPerSec=11.935550847178789, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:19:01,615] [INFO] [timer.py:197:stop] 0/2083, RunningAvgSamplesPerSec=11.989000776636056, CurrSamplesPerSec=12.014335451975189, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:19:08,043] [INFO] [timer.py:197:stop] 0/2084, RunningAvgSamplesPerSec=11.989016901165845, CurrSamplesPerSec=12.022666271577478, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:19:14,534] [INFO] [timer.py:197:stop] 0/2085, RunningAvgSamplesPerSec=11.988989477093924, CurrSamplesPerSec=11.93216332084963, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:19:21,020] [INFO] [timer.py:197:stop] 0/2086, RunningAvgSamplesPerSec=11.98900536209495, CurrSamplesPerSec=12.022185436884898, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:19:27,430] [INFO] [timer.py:197:stop] 0/2087, RunningAvgSamplesPerSec=11.988957880175235, CurrSamplesPerSec=11.890815974326317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:19:33,900] [INFO] [timer.py:197:stop] 0/2088, RunningAvgSamplesPerSec=11.98896572752131, CurrSamplesPerSec=12.005349814686301, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:19:40,308] [INFO] [timer.py:197:stop] 0/2089, 
RunningAvgSamplesPerSec=11.988985381790268, CurrSamplesPerSec=12.030124939470785, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:19:44,995] [INFO] [logging.py:68:log_dist] [Rank 0] step=2090, skipped=4, lr=[6.477777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 22:19:44,996] [INFO] [timer.py:197:stop] 0/2090, RunningAvgSamplesPerSec=11.990531299862461, CurrSamplesPerSec=16.4053377531445, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:19:51,510] [INFO] [timer.py:197:stop] 0/2091, RunningAvgSamplesPerSec=11.990497248030803, CurrSamplesPerSec=11.91981634184573, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:19:57,943] [INFO] [timer.py:197:stop] 0/2092, RunningAvgSamplesPerSec=11.990504824436767, CurrSamplesPerSec=12.006352865456037, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:20:04,431] [INFO] [timer.py:197:stop] 0/2093, RunningAvgSamplesPerSec=11.990513345943345, CurrSamplesPerSec=12.008349800508338, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:20:10,931] [INFO] [timer.py:197:stop] 0/2094, RunningAvgSamplesPerSec=11.990453173013405, CurrSamplesPerSec=11.865938790717305, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:20:17,485] [INFO] [timer.py:197:stop] 0/2095, RunningAvgSamplesPerSec=11.990387349992163, CurrSamplesPerSec=11.854249787808692, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:20:23,920] [INFO] [timer.py:197:stop] 0/2096, RunningAvgSamplesPerSec=11.990360756854837, CurrSamplesPerSec=11.93495862086382, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:20:30,413] [INFO] [timer.py:197:stop] 0/2097, RunningAvgSamplesPerSec=11.99031248902336, CurrSamplesPerSec=11.890084925806635, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:20:36,855] [INFO] [timer.py:197:stop] 0/2098, RunningAvgSamplesPerSec=11.990301913387388, CurrSamplesPerSec=11.968186840273408, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:20:43,419] [INFO] 
[timer.py:197:stop] 0/2099, RunningAvgSamplesPerSec=11.990240015453736, CurrSamplesPerSec=11.861891380637234, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:20:49,851] [INFO] [logging.py:68:log_dist] [Rank 0] step=2100, skipped=4, lr=[6.455555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 22:20:49,852] [INFO] [timer.py:197:stop] 0/2100, RunningAvgSamplesPerSec=11.990225667178025, CurrSamplesPerSec=11.960212683662926, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.455555555555556e-06, 'epoch': 55.26} [2022-12-19 22:20:56,299] [INFO] [timer.py:197:stop] 0/2101, RunningAvgSamplesPerSec=11.990210527599219, CurrSamplesPerSec=11.958531650640312, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:21:02,819] [INFO] [timer.py:197:stop] 0/2102, RunningAvgSamplesPerSec=11.99020116133998, CurrSamplesPerSec=11.970573580964237, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:21:09,349] [INFO] [timer.py:197:stop] 0/2103, RunningAvgSamplesPerSec=11.99014098339343, CurrSamplesPerSec=11.865085978023583, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:21:15,808] [INFO] [timer.py:197:stop] 0/2104, RunningAvgSamplesPerSec=11.990130470153556, CurrSamplesPerSec=11.968082788933286, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:21:22,329] [INFO] [timer.py:197:stop] 0/2105, RunningAvgSamplesPerSec=11.990092986591947, CurrSamplesPerSec=11.911817158410193, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:21:28,814] [INFO] [timer.py:197:stop] 0/2106, RunningAvgSamplesPerSec=11.990083042815433, CurrSamplesPerSec=11.96920770654523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:21:35,408] [INFO] [timer.py:197:stop] 0/2107, RunningAvgSamplesPerSec=11.989963200723734, CurrSamplesPerSec=11.743011282957074, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:21:41,890] [INFO] [timer.py:197:stop] 0/2108, RunningAvgSamplesPerSec=11.989977323495985, 
CurrSamplesPerSec=12.019779687374763, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:21:48,354] [INFO] [timer.py:197:stop] 0/2109, RunningAvgSamplesPerSec=11.989943071920637, CurrSamplesPerSec=11.918240833612565, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:21:54,867] [INFO] [logging.py:68:log_dist] [Rank 0] step=2110, skipped=4, lr=[6.433333333333333e-06], mom=[[0.9, 0.999]] [2022-12-19 22:21:54,868] [INFO] [timer.py:197:stop] 0/2110, RunningAvgSamplesPerSec=11.98991189911231, CurrSamplesPerSec=11.924588803730357, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:22:01,357] [INFO] [timer.py:197:stop] 0/2111, RunningAvgSamplesPerSec=11.989876147461628, CurrSamplesPerSec=11.914982647436425, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:22:07,898] [INFO] [timer.py:197:stop] 0/2112, RunningAvgSamplesPerSec=11.989834423306093, CurrSamplesPerSec=11.902479601073068, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:22:14,364] [INFO] [timer.py:197:stop] 0/2113, RunningAvgSamplesPerSec=11.98983632936067, CurrSamplesPerSec=11.99385945464458, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:22:20,877] [INFO] [timer.py:197:stop] 0/2114, RunningAvgSamplesPerSec=11.989790618721724, CurrSamplesPerSec=11.894066225335756, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:22:27,364] [INFO] [timer.py:197:stop] 0/2115, RunningAvgSamplesPerSec=11.989776914254882, CurrSamplesPerSec=11.960902816598791, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:22:34,242] [INFO] [timer.py:197:stop] 0/2116, RunningAvgSamplesPerSec=11.989735546694062, CurrSamplesPerSec=11.902958824838283, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:22:40,969] [INFO] [timer.py:197:stop] 0/2117, RunningAvgSamplesPerSec=11.989718201560274, CurrSamplesPerSec=11.953162438452573, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:22:47,801] [INFO] [timer.py:197:stop] 0/2118, 
RunningAvgSamplesPerSec=11.989658382335348, CurrSamplesPerSec=11.864462442886612, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:22:54,628] [INFO] [timer.py:197:stop] 0/2119, RunningAvgSamplesPerSec=11.98960326023149, CurrSamplesPerSec=11.874089174397554, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:23:01,726] [INFO] [logging.py:68:log_dist] [Rank 0] step=2120, skipped=4, lr=[6.411111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 22:23:01,727] [INFO] [timer.py:197:stop] 0/2120, RunningAvgSamplesPerSec=11.989479857959896, CurrSamplesPerSec=11.733810736688506, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:23:08,520] [INFO] [timer.py:197:stop] 0/2121, RunningAvgSamplesPerSec=11.989445077117262, CurrSamplesPerSec=11.916229318761822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:23:14,979] [INFO] [timer.py:197:stop] 0/2122, RunningAvgSamplesPerSec=11.989407981696951, CurrSamplesPerSec=11.911315022297709, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:23:21,430] [INFO] [timer.py:197:stop] 0/2123, RunningAvgSamplesPerSec=11.989401909908867, CurrSamplesPerSec=11.976543530837066, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:23:27,942] [INFO] [timer.py:197:stop] 0/2124, RunningAvgSamplesPerSec=11.989414788561852, CurrSamplesPerSec=12.016792816626072, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:23:34,424] [INFO] [timer.py:197:stop] 0/2125, RunningAvgSamplesPerSec=11.989411722645675, CurrSamplesPerSec=11.982909378584297, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.4000000000000006e-06, 'epoch': 55.92} [2022-12-19 22:23:41,069] [INFO] [timer.py:197:stop] 0/2126, RunningAvgSamplesPerSec=11.989371307677912, CurrSamplesPerSec=11.904180281050552, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:23:47,543] [INFO] [timer.py:197:stop] 0/2127, RunningAvgSamplesPerSec=11.989327645012311, CurrSamplesPerSec=11.89730032549267, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:23:52,208] [INFO] [timer.py:197:stop] 0/2128, RunningAvgSamplesPerSec=11.990870256771105, CurrSamplesPerSec=16.50302951051557, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:23:58,919] [INFO] [timer.py:197:stop] 0/2129, RunningAvgSamplesPerSec=11.990824732364405, CurrSamplesPerSec=11.894815157097613, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:24:05,394] [INFO] [logging.py:68:log_dist] [Rank 0] step=2130, skipped=4, lr=[6.3888888888888885e-06], mom=[[0.9, 0.999]] [2022-12-19 22:24:05,395] [INFO] [timer.py:197:stop] 0/2130, RunningAvgSamplesPerSec=11.990791536329638, CurrSamplesPerSec=11.920597105729147, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:24:11,870] [INFO] [timer.py:197:stop] 0/2131, RunningAvgSamplesPerSec=11.990754642228111, CurrSamplesPerSec=11.912754944882161, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:24:18,412] [INFO] [timer.py:197:stop] 0/2132, RunningAvgSamplesPerSec=11.9906978268213, CurrSamplesPerSec=11.87094642373062, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:24:24,877] [INFO] [timer.py:197:stop] 0/2133, RunningAvgSamplesPerSec=11.990669611525927, CurrSamplesPerSec=11.930870890982334, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:24:31,516] [INFO] [timer.py:197:stop] 0/2134, RunningAvgSamplesPerSec=11.99065044271475, CurrSamplesPerSec=11.949940458516169, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:24:38,070] [INFO] [timer.py:197:stop] 0/2135, RunningAvgSamplesPerSec=11.990664512064706, CurrSamplesPerSec=12.020735627378473, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:24:44,573] [INFO] [timer.py:197:stop] 0/2136, RunningAvgSamplesPerSec=11.99065487979535, CurrSamplesPerSec=11.970144409813598, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:24:51,071] [INFO] [timer.py:197:stop] 0/2137, RunningAvgSamplesPerSec=11.990620695516448, 
CurrSamplesPerSec=11.918112779026927, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:24:57,615] [INFO] [timer.py:197:stop] 0/2138, RunningAvgSamplesPerSec=11.990576057364136, CurrSamplesPerSec=11.896025453068358, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:25:04,301] [INFO] [timer.py:197:stop] 0/2139, RunningAvgSamplesPerSec=11.990527509142366, CurrSamplesPerSec=11.887718061989272, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:25:10,978] [INFO] [logging.py:68:log_dist] [Rank 0] step=2140, skipped=4, lr=[6.366666666666668e-06], mom=[[0.9, 0.999]] [2022-12-19 22:25:10,979] [INFO] [timer.py:197:stop] 0/2140, RunningAvgSamplesPerSec=11.990493418098314, CurrSamplesPerSec=11.91808103038181, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:25:17,417] [INFO] [timer.py:197:stop] 0/2141, RunningAvgSamplesPerSec=11.990467829650827, CurrSamplesPerSec=11.936008323865973, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:25:23,927] [INFO] [timer.py:197:stop] 0/2142, RunningAvgSamplesPerSec=11.990391015717703, CurrSamplesPerSec=11.828308082673006, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:25:30,428] [INFO] [timer.py:197:stop] 0/2143, RunningAvgSamplesPerSec=11.990405153058553, CurrSamplesPerSec=12.020735627378473, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:25:37,053] [INFO] [timer.py:197:stop] 0/2144, RunningAvgSamplesPerSec=11.99041147674148, CurrSamplesPerSec=12.00396579393174, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:25:43,703] [INFO] [timer.py:197:stop] 0/2145, RunningAvgSamplesPerSec=11.990417699808464, CurrSamplesPerSec=12.003762351532313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:25:50,131] [INFO] [timer.py:197:stop] 0/2146, RunningAvgSamplesPerSec=11.990390139033465, CurrSamplesPerSec=11.931617040170146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:25:56,614] [INFO] [timer.py:197:stop] 0/2147, 
RunningAvgSamplesPerSec=11.990359032048957, CurrSamplesPerSec=11.924034741621329, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:26:03,121] [INFO] [timer.py:197:stop] 0/2148, RunningAvgSamplesPerSec=11.990365899221725, CurrSamplesPerSec=12.005114111337289, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:26:09,673] [INFO] [timer.py:197:stop] 0/2149, RunningAvgSamplesPerSec=11.990295176539126, CurrSamplesPerSec=11.840422246732128, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:26:16,243] [INFO] [logging.py:68:log_dist] [Rank 0] step=2150, skipped=4, lr=[6.3444444444444454e-06], mom=[[0.9, 0.999]] [2022-12-19 22:26:16,244] [INFO] [timer.py:197:stop] 0/2150, RunningAvgSamplesPerSec=11.990311627793435, CurrSamplesPerSec=12.025736874448043, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.3444444444444454e-06, 'epoch': 56.58} [2022-12-19 22:26:22,769] [INFO] [timer.py:197:stop] 0/2151, RunningAvgSamplesPerSec=11.990315853177917, CurrSamplesPerSec=11.999398857668647, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:26:29,363] [INFO] [timer.py:197:stop] 0/2152, RunningAvgSamplesPerSec=11.990322451101164, CurrSamplesPerSec=12.004518182877646, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:26:35,857] [INFO] [timer.py:197:stop] 0/2153, RunningAvgSamplesPerSec=11.990275224944885, CurrSamplesPerSec=11.889591992823958, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:26:42,266] [INFO] [timer.py:197:stop] 0/2154, RunningAvgSamplesPerSec=11.990280814945413, CurrSamplesPerSec=12.002316981799888, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:26:48,666] [INFO] [timer.py:197:stop] 0/2155, RunningAvgSamplesPerSec=11.990292805004252, CurrSamplesPerSec=12.01615108341017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:26:55,280] [INFO] [timer.py:197:stop] 0/2156, RunningAvgSamplesPerSec=11.990238155918465, CurrSamplesPerSec=11.873722565714077, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:27:01,864] [INFO] [timer.py:197:stop] 0/2157, RunningAvgSamplesPerSec=11.990211223808956, CurrSamplesPerSec=11.93247891355172, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:27:08,488] [INFO] [timer.py:197:stop] 0/2158, RunningAvgSamplesPerSec=11.990183686844635, CurrSamplesPerSec=11.931133915155597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:27:15,176] [INFO] [timer.py:197:stop] 0/2159, RunningAvgSamplesPerSec=11.990065099893314, CurrSamplesPerSec=11.73973215254108, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:27:21,861] [INFO] [logging.py:68:log_dist] [Rank 0] step=2160, skipped=4, lr=[6.322222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 22:27:21,862] [INFO] [timer.py:197:stop] 0/2160, RunningAvgSamplesPerSec=11.99003305898277, CurrSamplesPerSec=11.921317087132918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:27:28,344] [INFO] [timer.py:197:stop] 0/2161, RunningAvgSamplesPerSec=11.990002468528882, CurrSamplesPerSec=11.924349904480094, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:27:34,862] [INFO] [timer.py:197:stop] 0/2162, RunningAvgSamplesPerSec=11.99001283998224, CurrSamplesPerSec=12.01244672362398, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:27:41,433] [INFO] [timer.py:197:stop] 0/2163, RunningAvgSamplesPerSec=11.989982757172319, CurrSamplesPerSec=11.92535429928206, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:27:47,937] [INFO] [timer.py:197:stop] 0/2164, RunningAvgSamplesPerSec=11.989927674758977, CurrSamplesPerSec=11.87206523076994, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:27:54,627] [INFO] [timer.py:197:stop] 0/2165, RunningAvgSamplesPerSec=11.989843420334953, CurrSamplesPerSec=11.810412653845614, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:27:59,390] [INFO] [timer.py:197:stop] 0/2166, RunningAvgSamplesPerSec=11.991382760965633, 
CurrSamplesPerSec=16.60169253050962, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:28:05,885] [INFO] [timer.py:197:stop] 0/2167, RunningAvgSamplesPerSec=11.991317819500242, CurrSamplesPerSec=11.852413146510873, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:28:12,384] [INFO] [timer.py:197:stop] 0/2168, RunningAvgSamplesPerSec=11.991268645728784, CurrSamplesPerSec=11.885744728633458, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:28:18,957] [INFO] [timer.py:197:stop] 0/2169, RunningAvgSamplesPerSec=11.991247365548839, CurrSamplesPerSec=11.94533107381017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:28:25,551] [INFO] [logging.py:68:log_dist] [Rank 0] step=2170, skipped=4, lr=[6.300000000000001e-06], mom=[[0.9, 0.999]] [2022-12-19 22:28:25,552] [INFO] [timer.py:197:stop] 0/2170, RunningAvgSamplesPerSec=11.991215426769653, CurrSamplesPerSec=11.922401458533578, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:28:32,006] [INFO] [timer.py:197:stop] 0/2171, RunningAvgSamplesPerSec=11.991186985902292, CurrSamplesPerSec=11.92984276864144, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:28:38,570] [INFO] [timer.py:197:stop] 0/2172, RunningAvgSamplesPerSec=11.991108632075077, CurrSamplesPerSec=11.82353528544186, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:28:45,042] [INFO] [timer.py:197:stop] 0/2173, RunningAvgSamplesPerSec=11.991116952088525, CurrSamplesPerSec=12.009198618480351, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:28:51,569] [INFO] [timer.py:197:stop] 0/2174, RunningAvgSamplesPerSec=11.991115098727857, CurrSamplesPerSec=11.987092803031995, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:28:58,103] [INFO] [timer.py:197:stop] 0/2175, RunningAvgSamplesPerSec=11.991122638291568, CurrSamplesPerSec=12.007520975719293, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.28888888888889e-06, 'epoch': 57.24} [2022-12-19 
22:29:04,650] [INFO] [timer.py:197:stop] 0/2176, RunningAvgSamplesPerSec=11.991092926433012, CurrSamplesPerSec=11.926874986781757, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:29:11,082] [INFO] [timer.py:197:stop] 0/2177, RunningAvgSamplesPerSec=11.991100373429413, CurrSamplesPerSec=12.007312041837181, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:29:17,688] [INFO] [timer.py:197:stop] 0/2178, RunningAvgSamplesPerSec=11.991063600222951, CurrSamplesPerSec=11.911612070336146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:29:24,123] [INFO] [timer.py:197:stop] 0/2179, RunningAvgSamplesPerSec=11.991042955338537, CurrSamplesPerSec=11.946287436083677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:29:30,665] [INFO] [logging.py:68:log_dist] [Rank 0] step=2180, skipped=4, lr=[6.277777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 22:29:30,665] [INFO] [timer.py:197:stop] 0/2180, RunningAvgSamplesPerSec=11.991054488593504, CurrSamplesPerSec=12.016215092279639, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:29:37,282] [INFO] [timer.py:197:stop] 0/2181, RunningAvgSamplesPerSec=11.991026131185269, CurrSamplesPerSec=11.929580331555865, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:29:43,927] [INFO] [timer.py:197:stop] 0/2182, RunningAvgSamplesPerSec=11.990978483109911, CurrSamplesPerSec=11.888044996418557, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:29:50,515] [INFO] [timer.py:197:stop] 0/2183, RunningAvgSamplesPerSec=11.99094052612562, CurrSamplesPerSec=11.908761654593484, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:29:57,176] [INFO] [timer.py:197:stop] 0/2184, RunningAvgSamplesPerSec=11.990901860134445, CurrSamplesPerSec=11.907160545698206, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:30:03,622] [INFO] [timer.py:197:stop] 0/2185, RunningAvgSamplesPerSec=11.990898729140023, CurrSamplesPerSec=11.984070791324125, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 22:30:10,107] [INFO] [timer.py:197:stop] 0/2186, RunningAvgSamplesPerSec=11.990904888185314, CurrSamplesPerSec=12.004365183764389, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:30:16,619] [INFO] [timer.py:197:stop] 0/2187, RunningAvgSamplesPerSec=11.99084877333928, CurrSamplesPerSec=11.869534434348123, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:30:23,100] [INFO] [timer.py:197:stop] 0/2188, RunningAvgSamplesPerSec=11.990817908757489, CurrSamplesPerSec=11.923756140933863, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:30:29,691] [INFO] [timer.py:197:stop] 0/2189, RunningAvgSamplesPerSec=11.990815749153699, CurrSamplesPerSec=11.986096714041961, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:30:36,322] [INFO] [logging.py:68:log_dist] [Rank 0] step=2190, skipped=4, lr=[6.255555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 22:30:36,323] [INFO] [timer.py:197:stop] 0/2190, RunningAvgSamplesPerSec=11.990810647324707, CurrSamplesPerSec=11.979663324883223, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:30:42,883] [INFO] [timer.py:197:stop] 0/2191, RunningAvgSamplesPerSec=11.99077985219808, CurrSamplesPerSec=11.923776797191975, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:30:49,321] [INFO] [timer.py:197:stop] 0/2192, RunningAvgSamplesPerSec=11.990746044821169, CurrSamplesPerSec=11.917195840435795, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:30:55,777] [INFO] [timer.py:197:stop] 0/2193, RunningAvgSamplesPerSec=11.990714340293238, CurrSamplesPerSec=11.921681346147773, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:31:02,276] [INFO] [timer.py:197:stop] 0/2194, RunningAvgSamplesPerSec=11.990685512727673, CurrSamplesPerSec=11.927855426470707, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:31:08,733] [INFO] [timer.py:197:stop] 0/2195, RunningAvgSamplesPerSec=11.990677176054403, CurrSamplesPerSec=11.972431008389764, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:31:15,193] [INFO] [timer.py:197:stop] 0/2196, RunningAvgSamplesPerSec=11.990661427827089, CurrSamplesPerSec=11.956224795974865, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:31:21,684] [INFO] [timer.py:197:stop] 0/2197, RunningAvgSamplesPerSec=11.990624275165231, CurrSamplesPerSec=11.909661972214666, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:31:28,162] [INFO] [timer.py:197:stop] 0/2198, RunningAvgSamplesPerSec=11.990580318642495, CurrSamplesPerSec=11.894866283957208, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:31:34,620] [INFO] [timer.py:197:stop] 0/2199, RunningAvgSamplesPerSec=11.990583576535919, CurrSamplesPerSec=11.997742183714537, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:31:41,087] [INFO] [logging.py:68:log_dist] [Rank 0] step=2200, skipped=4, lr=[6.2333333333333335e-06], mom=[[0.9, 0.999]] [2022-12-19 22:31:41,087] [INFO] [timer.py:197:stop] 0/2200, RunningAvgSamplesPerSec=11.99054952668905, CurrSamplesPerSec=11.916206043763722, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.2333333333333335e-06, 'epoch': 57.89} [2022-12-19 22:31:47,521] [INFO] [timer.py:197:stop] 0/2201, RunningAvgSamplesPerSec=11.990555400524146, CurrSamplesPerSec=12.0034800128211, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:31:53,925] [INFO] [timer.py:197:stop] 0/2202, RunningAvgSamplesPerSec=11.990563588652902, CurrSamplesPerSec=12.008596375135504, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:32:00,399] [INFO] [timer.py:197:stop] 0/2203, RunningAvgSamplesPerSec=11.990522296593277, CurrSamplesPerSec=11.900363139229203, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:32:04,988] [INFO] [timer.py:197:stop] 0/2204, RunningAvgSamplesPerSec=11.992037905374806, CurrSamplesPerSec=16.614248909230472, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:32:11,431] [INFO] [timer.py:197:stop] 
0/2205, RunningAvgSamplesPerSec=11.992007397675954, CurrSamplesPerSec=11.92520384118544, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:32:17,926] [INFO] [timer.py:197:stop] 0/2206, RunningAvgSamplesPerSec=11.991963593501382, CurrSamplesPerSec=11.896233695194313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:32:24,418] [INFO] [timer.py:197:stop] 0/2207, RunningAvgSamplesPerSec=11.99192278373281, CurrSamplesPerSec=11.902647958329208, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:32:30,892] [INFO] [timer.py:197:stop] 0/2208, RunningAvgSamplesPerSec=11.99187693699473, CurrSamplesPerSec=11.891630346019342, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:32:37,331] [INFO] [timer.py:197:stop] 0/2209, RunningAvgSamplesPerSec=11.991840953623472, CurrSamplesPerSec=11.912983863323046, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:32:43,800] [INFO] [logging.py:68:log_dist] [Rank 0] step=2210, skipped=4, lr=[6.211111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 22:32:43,801] [INFO] [timer.py:197:stop] 0/2210, RunningAvgSamplesPerSec=11.991845975020277, CurrSamplesPerSec=12.002938453493304, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:32:50,314] [INFO] [timer.py:197:stop] 0/2211, RunningAvgSamplesPerSec=11.991817738202124, CurrSamplesPerSec=11.929793461399017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:32:56,836] [INFO] [timer.py:197:stop] 0/2212, RunningAvgSamplesPerSec=11.991782343739775, CurrSamplesPerSec=11.91410267676835, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:33:03,335] [INFO] [timer.py:197:stop] 0/2213, RunningAvgSamplesPerSec=11.991728357149588, CurrSamplesPerSec=11.873593890383164, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:33:09,747] [INFO] [timer.py:197:stop] 0/2214, RunningAvgSamplesPerSec=11.991718473162786, CurrSamplesPerSec=11.969904749266293, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:33:16,208] [INFO] 
[timer.py:197:stop] 0/2215, RunningAvgSamplesPerSec=11.991678800843276, CurrSamplesPerSec=11.904561443593092, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:33:22,705] [INFO] [timer.py:197:stop] 0/2216, RunningAvgSamplesPerSec=11.991637602544307, CurrSamplesPerSec=11.901154020050352, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:33:29,158] [INFO] [timer.py:197:stop] 0/2217, RunningAvgSamplesPerSec=11.991642109596459, CurrSamplesPerSec=12.001629037244621, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:33:35,576] [INFO] [timer.py:197:stop] 0/2218, RunningAvgSamplesPerSec=11.991640024213643, CurrSamplesPerSec=11.987022680657835, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:33:41,948] [INFO] [timer.py:197:stop] 0/2219, RunningAvgSamplesPerSec=11.991644770881186, CurrSamplesPerSec=12.002172624961258, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:33:48,421] [INFO] [logging.py:68:log_dist] [Rank 0] step=2220, skipped=4, lr=[6.18888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 22:33:48,422] [INFO] [timer.py:197:stop] 0/2220, RunningAvgSamplesPerSec=11.991639171798548, CurrSamplesPerSec=11.979238847592352, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:33:54,913] [INFO] [timer.py:197:stop] 0/2221, RunningAvgSamplesPerSec=11.991592507387558, CurrSamplesPerSec=11.888976937191089, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:34:01,408] [INFO] [timer.py:197:stop] 0/2222, RunningAvgSamplesPerSec=11.991599255648067, CurrSamplesPerSec=12.006592376668797, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:34:07,900] [INFO] [timer.py:197:stop] 0/2223, RunningAvgSamplesPerSec=11.991559492465072, CurrSamplesPerSec=11.903930585304677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:34:14,385] [INFO] [timer.py:197:stop] 0/2224, RunningAvgSamplesPerSec=11.991563693985674, CurrSamplesPerSec=12.000902541803248, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
22:34:20,807] [INFO] [timer.py:197:stop] 0/2225, RunningAvgSamplesPerSec=11.9915709615148, CurrSamplesPerSec=12.007741196679072, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.177777777777778e-06, 'epoch': 58.55} [2022-12-19 22:34:27,271] [INFO] [timer.py:197:stop] 0/2226, RunningAvgSamplesPerSec=11.991564734198713, CurrSamplesPerSec=11.977737380315975, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:34:33,780] [INFO] [timer.py:197:stop] 0/2227, RunningAvgSamplesPerSec=11.991534914121395, CurrSamplesPerSec=11.925579993545755, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:34:40,236] [INFO] [timer.py:197:stop] 0/2228, RunningAvgSamplesPerSec=11.991501964405003, CurrSamplesPerSec=11.918634539325922, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:34:46,708] [INFO] [timer.py:197:stop] 0/2229, RunningAvgSamplesPerSec=11.9914960570004, CurrSamplesPerSec=11.978360585189678, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:34:53,165] [INFO] [logging.py:68:log_dist] [Rank 0] step=2230, skipped=4, lr=[6.166666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 22:34:53,165] [INFO] [timer.py:197:stop] 0/2230, RunningAvgSamplesPerSec=11.991500879117725, CurrSamplesPerSec=12.002249364409687, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:34:59,695] [INFO] [timer.py:197:stop] 0/2231, RunningAvgSamplesPerSec=11.991471020875283, CurrSamplesPerSec=11.925314035475699, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:35:06,188] [INFO] [timer.py:197:stop] 0/2232, RunningAvgSamplesPerSec=11.99143045873706, CurrSamplesPerSec=11.901694348639001, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:35:12,677] [INFO] [timer.py:197:stop] 0/2233, RunningAvgSamplesPerSec=11.991393884869984, CurrSamplesPerSec=11.910385389395975, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:35:19,108] [INFO] [timer.py:197:stop] 0/2234, RunningAvgSamplesPerSec=11.991402967746001, 
CurrSamplesPerSec=12.011701180830533, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:35:25,616] [INFO] [timer.py:197:stop] 0/2235, RunningAvgSamplesPerSec=11.9914111117277, CurrSamplesPerSec=12.009616087538717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:35:32,120] [INFO] [timer.py:197:stop] 0/2236, RunningAvgSamplesPerSec=11.991420498177874, CurrSamplesPerSec=12.012417158160531, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:35:38,611] [INFO] [timer.py:197:stop] 0/2237, RunningAvgSamplesPerSec=11.991396659639145, CurrSamplesPerSec=11.93837693675026, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:35:45,039] [INFO] [timer.py:197:stop] 0/2238, RunningAvgSamplesPerSec=11.991396367996563, CurrSamplesPerSec=11.990744582275125, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:35:51,524] [INFO] [timer.py:197:stop] 0/2239, RunningAvgSamplesPerSec=11.991359076990111, CurrSamplesPerSec=11.908552445342684, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:35:57,996] [INFO] [logging.py:68:log_dist] [Rank 0] step=2240, skipped=4, lr=[6.144444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 22:35:57,997] [INFO] [timer.py:197:stop] 0/2240, RunningAvgSamplesPerSec=11.991328520876806, CurrSamplesPerSec=11.923362096791854, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:36:04,514] [INFO] [timer.py:197:stop] 0/2241, RunningAvgSamplesPerSec=11.991205994829095, CurrSamplesPerSec=11.723125859957197, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:36:09,183] [INFO] [timer.py:197:stop] 0/2242, RunningAvgSamplesPerSec=11.992679724884079, CurrSamplesPerSec=16.545623801348068, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:36:15,722] [INFO] [timer.py:197:stop] 0/2243, RunningAvgSamplesPerSec=11.99264835178444, CurrSamplesPerSec=11.922782200162757, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:36:22,187] [INFO] [timer.py:197:stop] 0/2244, 
RunningAvgSamplesPerSec=11.992617633427876, CurrSamplesPerSec=11.924170868745012, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:36:28,685] [INFO] [timer.py:197:stop] 0/2245, RunningAvgSamplesPerSec=11.992578139411934, CurrSamplesPerSec=11.904681815782974, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:36:35,146] [INFO] [timer.py:197:stop] 0/2246, RunningAvgSamplesPerSec=11.992589627679452, CurrSamplesPerSec=12.018413323203502, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:36:41,620] [INFO] [timer.py:197:stop] 0/2247, RunningAvgSamplesPerSec=11.992587294069011, CurrSamplesPerSec=11.987352958855139, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:36:48,052] [INFO] [timer.py:197:stop] 0/2248, RunningAvgSamplesPerSec=11.992562018547211, CurrSamplesPerSec=11.93608581178941, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:36:54,558] [INFO] [timer.py:197:stop] 0/2249, RunningAvgSamplesPerSec=11.992534872892284, CurrSamplesPerSec=11.931874263190736, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:37:01,082] [INFO] [logging.py:68:log_dist] [Rank 0] step=2250, skipped=4, lr=[6.1222222222222224e-06], mom=[[0.9, 0.999]] [2022-12-19 22:37:01,082] [INFO] [timer.py:197:stop] 0/2250, RunningAvgSamplesPerSec=11.99251077833845, CurrSamplesPerSec=11.938613743549931, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.1222222222222224e-06, 'epoch': 59.21} [2022-12-19 22:37:07,546] [INFO] [timer.py:197:stop] 0/2251, RunningAvgSamplesPerSec=11.992459711742766, CurrSamplesPerSec=11.878750964744462, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:37:13,963] [INFO] [timer.py:197:stop] 0/2252, RunningAvgSamplesPerSec=11.992463514573583, CurrSamplesPerSec=12.001022187513469, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:37:20,719] [INFO] [timer.py:197:stop] 0/2253, RunningAvgSamplesPerSec=11.992398765798445, CurrSamplesPerSec=11.848463339433192, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:37:27,595] [INFO] [timer.py:197:stop] 0/2254, RunningAvgSamplesPerSec=11.992331042443434, CurrSamplesPerSec=11.84180015349075, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:37:34,251] [INFO] [timer.py:197:stop] 0/2255, RunningAvgSamplesPerSec=11.992314323312769, CurrSamplesPerSec=11.954780735126358, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:37:41,264] [INFO] [timer.py:197:stop] 0/2256, RunningAvgSamplesPerSec=11.992256876037255, CurrSamplesPerSec=11.864210739953172, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:37:48,202] [INFO] [timer.py:197:stop] 0/2257, RunningAvgSamplesPerSec=11.992271666996805, CurrSamplesPerSec=12.02570347233058, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:37:54,652] [INFO] [timer.py:197:stop] 0/2258, RunningAvgSamplesPerSec=11.992226157367064, CurrSamplesPerSec=11.890473086629747, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:38:01,153] [INFO] [timer.py:197:stop] 0/2259, RunningAvgSamplesPerSec=11.992209779297738, CurrSamplesPerSec=11.955374397856511, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:38:07,821] [INFO] [logging.py:68:log_dist] [Rank 0] step=2260, skipped=4, lr=[6.1e-06], mom=[[0.9, 0.999]] [2022-12-19 22:38:07,822] [INFO] [timer.py:197:stop] 0/2260, RunningAvgSamplesPerSec=11.992188483757053, CurrSamplesPerSec=11.944316402128033, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:38:14,458] [INFO] [timer.py:197:stop] 0/2261, RunningAvgSamplesPerSec=11.992143772819317, CurrSamplesPerSec=11.892029671213749, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:38:20,974] [INFO] [timer.py:197:stop] 0/2262, RunningAvgSamplesPerSec=11.99204377429433, CurrSamplesPerSec=11.770325509117727, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:38:27,499] [INFO] [timer.py:197:stop] 0/2263, RunningAvgSamplesPerSec=11.992000170251737, CurrSamplesPerSec=11.894258587969333, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:38:33,909] [INFO] [timer.py:197:stop] 0/2264, RunningAvgSamplesPerSec=11.992009062102353, CurrSamplesPerSec=12.012147312838412, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:38:40,507] [INFO] [timer.py:197:stop] 0/2265, RunningAvgSamplesPerSec=11.992017484476957, CurrSamplesPerSec=12.0110992239009, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:38:47,096] [INFO] [timer.py:197:stop] 0/2266, RunningAvgSamplesPerSec=11.992016175616557, CurrSamplesPerSec=11.989054956258904, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:38:53,556] [INFO] [timer.py:197:stop] 0/2267, RunningAvgSamplesPerSec=11.99197947235371, CurrSamplesPerSec=11.909455373214506, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:39:00,062] [INFO] [timer.py:197:stop] 0/2268, RunningAvgSamplesPerSec=11.991943487556718, CurrSamplesPerSec=11.910988392269335, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:39:06,521] [INFO] [timer.py:197:stop] 0/2269, RunningAvgSamplesPerSec=11.991947351005145, CurrSamplesPerSec=12.000708323801616, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:39:13,130] [INFO] [logging.py:68:log_dist] [Rank 0] step=2270, skipped=4, lr=[6.077777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 22:39:13,131] [INFO] [timer.py:197:stop] 0/2270, RunningAvgSamplesPerSec=11.991951072158137, CurrSamplesPerSec=12.00039286707534, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:39:19,727] [INFO] [timer.py:197:stop] 0/2271, RunningAvgSamplesPerSec=11.991919897797104, CurrSamplesPerSec=11.9216310474301, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:39:26,269] [INFO] [timer.py:197:stop] 0/2272, RunningAvgSamplesPerSec=11.991930224423589, CurrSamplesPerSec=12.015407232024332, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:39:32,877] [INFO] [timer.py:197:stop] 0/2273, RunningAvgSamplesPerSec=11.991874824335492, 
CurrSamplesPerSec=11.867422326507741, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:39:39,393] [INFO] [timer.py:197:stop] 0/2274, RunningAvgSamplesPerSec=11.991842385811069, CurrSamplesPerSec=11.918624484689698, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:39:45,874] [INFO] [timer.py:197:stop] 0/2275, RunningAvgSamplesPerSec=11.991806771645125, CurrSamplesPerSec=11.911433945445738, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.066666666666667e-06, 'epoch': 59.87} [2022-12-19 22:39:52,323] [INFO] [timer.py:197:stop] 0/2276, RunningAvgSamplesPerSec=11.99176058019966, CurrSamplesPerSec=11.887679104818343, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:39:58,713] [INFO] [timer.py:197:stop] 0/2277, RunningAvgSamplesPerSec=11.991761578678862, CurrSamplesPerSec=11.994032550560972, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:40:05,324] [INFO] [timer.py:197:stop] 0/2278, RunningAvgSamplesPerSec=11.991729755695586, CurrSamplesPerSec=11.919767117448771, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:40:11,938] [INFO] [timer.py:197:stop] 0/2279, RunningAvgSamplesPerSec=11.991736718850193, CurrSamplesPerSec=12.007605840400812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:40:16,588] [INFO] [logging.py:68:log_dist] [Rank 0] step=2280, skipped=4, lr=[6.055555555555555e-06], mom=[[0.9, 0.999]] [2022-12-19 22:40:16,589] [INFO] [timer.py:197:stop] 0/2280, RunningAvgSamplesPerSec=11.993199130226063, CurrSamplesPerSec=16.60380687869682, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:40:23,099] [INFO] [timer.py:197:stop] 0/2281, RunningAvgSamplesPerSec=11.993186439336736, CurrSamplesPerSec=11.964346144145178, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:40:29,600] [INFO] [timer.py:197:stop] 0/2282, RunningAvgSamplesPerSec=11.993145011836743, CurrSamplesPerSec=11.899469501818087, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
22:40:36,131] [INFO] [timer.py:197:stop] 0/2283, RunningAvgSamplesPerSec=11.993095209362902, CurrSamplesPerSec=11.880611027105592, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:40:42,625] [INFO] [timer.py:197:stop] 0/2284, RunningAvgSamplesPerSec=11.993036189392134, CurrSamplesPerSec=11.859906699957794, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:40:49,123] [INFO] [timer.py:197:stop] 0/2285, RunningAvgSamplesPerSec=11.993031470605523, CurrSamplesPerSec=11.98227286370099, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:40:55,752] [INFO] [timer.py:197:stop] 0/2286, RunningAvgSamplesPerSec=11.993028089870805, CurrSamplesPerSec=11.985314838619228, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:41:02,288] [INFO] [timer.py:197:stop] 0/2287, RunningAvgSamplesPerSec=11.99302964326932, CurrSamplesPerSec=11.99657865585544, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:41:08,879] [INFO] [timer.py:197:stop] 0/2288, RunningAvgSamplesPerSec=11.993038275108026, CurrSamplesPerSec=12.012794531911059, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:41:15,486] [INFO] [timer.py:197:stop] 0/2289, RunningAvgSamplesPerSec=11.993018724043068, CurrSamplesPerSec=11.94849100120159, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:41:22,011] [INFO] [logging.py:68:log_dist] [Rank 0] step=2290, skipped=4, lr=[6.033333333333335e-06], mom=[[0.9, 0.999]] [2022-12-19 22:41:22,011] [INFO] [timer.py:197:stop] 0/2290, RunningAvgSamplesPerSec=11.992970898267341, CurrSamplesPerSec=11.88458230037679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:41:28,511] [INFO] [timer.py:197:stop] 0/2291, RunningAvgSamplesPerSec=11.992961885934552, CurrSamplesPerSec=11.972377076675489, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:41:35,056] [INFO] [timer.py:197:stop] 0/2292, RunningAvgSamplesPerSec=11.992889586521285, CurrSamplesPerSec=11.82964980664779, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 22:41:41,519] [INFO] [timer.py:197:stop] 0/2293, RunningAvgSamplesPerSec=11.99283410474335, CurrSamplesPerSec=11.867113313398862, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:41:48,127] [INFO] [timer.py:197:stop] 0/2294, RunningAvgSamplesPerSec=11.992797931604166, CurrSamplesPerSec=11.910494253058184, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:41:54,741] [INFO] [timer.py:197:stop] 0/2295, RunningAvgSamplesPerSec=11.992767303806067, CurrSamplesPerSec=11.922977081503927, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:42:01,296] [INFO] [timer.py:197:stop] 0/2296, RunningAvgSamplesPerSec=11.992765233073625, CurrSamplesPerSec=11.988018923562715, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:42:07,823] [INFO] [timer.py:197:stop] 0/2297, RunningAvgSamplesPerSec=11.99265432731502, CurrSamplesPerSec=11.743523989169912, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:42:14,300] [INFO] [timer.py:197:stop] 0/2298, RunningAvgSamplesPerSec=11.992621873202422, CurrSamplesPerSec=11.918599612767716, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:42:20,749] [INFO] [timer.py:197:stop] 0/2299, RunningAvgSamplesPerSec=11.9925858519472, CurrSamplesPerSec=11.910447747949291, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:42:27,315] [INFO] [logging.py:68:log_dist] [Rank 0] step=2300, skipped=4, lr=[6.011111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 22:42:27,316] [INFO] [timer.py:197:stop] 0/2300, RunningAvgSamplesPerSec=11.99256806537745, CurrSamplesPerSec=11.95185108785102, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 6.011111111111112e-06, 'epoch': 60.53} [2022-12-19 22:42:33,995] [INFO] [timer.py:197:stop] 0/2301, RunningAvgSamplesPerSec=11.992536356717578, CurrSamplesPerSec=11.920110108796603, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:42:40,543] [INFO] [timer.py:197:stop] 0/2302, RunningAvgSamplesPerSec=11.99253443054534, 
CurrSamplesPerSec=11.988107795831798, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:42:47,031] [INFO] [timer.py:197:stop] 0/2303, RunningAvgSamplesPerSec=11.992492500764103, CurrSamplesPerSec=11.896823666269214, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:42:53,492] [INFO] [timer.py:197:stop] 0/2304, RunningAvgSamplesPerSec=11.992476499133701, CurrSamplesPerSec=11.955769495774568, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:43:00,000] [INFO] [timer.py:197:stop] 0/2305, RunningAvgSamplesPerSec=11.992441889871493, CurrSamplesPerSec=11.913297384839616, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:43:06,493] [INFO] [timer.py:197:stop] 0/2306, RunningAvgSamplesPerSec=11.992419756894238, CurrSamplesPerSec=11.941663337618834, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:43:13,073] [INFO] [timer.py:197:stop] 0/2307, RunningAvgSamplesPerSec=11.992388640775719, CurrSamplesPerSec=11.921123318878438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:43:19,549] [INFO] [timer.py:197:stop] 0/2308, RunningAvgSamplesPerSec=11.99239586541497, CurrSamplesPerSec=12.009071825387895, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:43:26,174] [INFO] [timer.py:197:stop] 0/2309, RunningAvgSamplesPerSec=11.992366263051089, CurrSamplesPerSec=11.92448974678199, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:43:32,634] [INFO] [logging.py:68:log_dist] [Rank 0] step=2310, skipped=4, lr=[5.98888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 22:43:32,635] [INFO] [timer.py:197:stop] 0/2310, RunningAvgSamplesPerSec=11.992340902902441, CurrSamplesPerSec=11.934119203613907, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:43:39,093] [INFO] [timer.py:197:stop] 0/2311, RunningAvgSamplesPerSec=11.99231244937457, CurrSamplesPerSec=11.926999520183847, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:43:45,575] [INFO] [timer.py:197:stop] 0/2312, 
RunningAvgSamplesPerSec=11.992278228465965, CurrSamplesPerSec=11.913779595163414, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:43:52,159] [INFO] [timer.py:197:stop] 0/2313, RunningAvgSamplesPerSec=11.992245488595506, CurrSamplesPerSec=11.917090557614651, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:43:58,657] [INFO] [timer.py:197:stop] 0/2314, RunningAvgSamplesPerSec=11.992211885783256, CurrSamplesPerSec=11.915055631610285, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:44:05,299] [INFO] [timer.py:197:stop] 0/2315, RunningAvgSamplesPerSec=11.99223028179972, CurrSamplesPerSec=12.034913316804998, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:44:11,769] [INFO] [timer.py:197:stop] 0/2316, RunningAvgSamplesPerSec=11.992238319525486, CurrSamplesPerSec=12.010858458022001, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:44:18,234] [INFO] [timer.py:197:stop] 0/2317, RunningAvgSamplesPerSec=11.99221073342719, CurrSamplesPerSec=11.928714636132717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:44:22,877] [INFO] [timer.py:197:stop] 0/2318, RunningAvgSamplesPerSec=11.993632285463045, CurrSamplesPerSec=16.529699250319513, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:44:29,358] [INFO] [timer.py:197:stop] 0/2319, RunningAvgSamplesPerSec=11.993608142544998, CurrSamplesPerSec=11.937952725636308, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:44:35,839] [INFO] [logging.py:68:log_dist] [Rank 0] step=2320, skipped=4, lr=[5.966666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 22:44:35,840] [INFO] [timer.py:197:stop] 0/2320, RunningAvgSamplesPerSec=11.993582377580015, CurrSamplesPerSec=11.93418074979258, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:44:42,324] [INFO] [timer.py:197:stop] 0/2321, RunningAvgSamplesPerSec=11.993554033259153, CurrSamplesPerSec=11.928210014347066, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:44:48,807] [INFO] 
[timer.py:197:stop] 0/2322, RunningAvgSamplesPerSec=11.993498482021579, CurrSamplesPerSec=11.866044745258751, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:44:55,340] [INFO] [timer.py:197:stop] 0/2323, RunningAvgSamplesPerSec=11.993471954693545, CurrSamplesPerSec=11.932242880461253, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:45:01,950] [INFO] [timer.py:197:stop] 0/2324, RunningAvgSamplesPerSec=11.993482096343431, CurrSamplesPerSec=12.017067174456486, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:45:08,600] [INFO] [timer.py:197:stop] 0/2325, RunningAvgSamplesPerSec=11.993490576486323, CurrSamplesPerSec=12.013213863888037, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.955555555555555e-06, 'epoch': 61.18} [2022-12-19 22:45:15,066] [INFO] [timer.py:197:stop] 0/2326, RunningAvgSamplesPerSec=11.993497596478399, CurrSamplesPerSec=12.009827250895327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:45:21,524] [INFO] [timer.py:197:stop] 0/2327, RunningAvgSamplesPerSec=11.99347046522932, CurrSamplesPerSec=11.930747337259145, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:45:28,073] [INFO] [timer.py:197:stop] 0/2328, RunningAvgSamplesPerSec=11.993466194928438, CurrSamplesPerSec=11.983545961096297, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:45:34,626] [INFO] [timer.py:197:stop] 0/2329, RunningAvgSamplesPerSec=11.993446358980435, CurrSamplesPerSec=11.947484832503832, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:45:41,099] [INFO] [logging.py:68:log_dist] [Rank 0] step=2330, skipped=4, lr=[5.944444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 22:45:41,100] [INFO] [timer.py:197:stop] 0/2330, RunningAvgSamplesPerSec=11.993413253362304, CurrSamplesPerSec=11.916868358771643, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:45:47,556] [INFO] [timer.py:197:stop] 0/2331, RunningAvgSamplesPerSec=11.993381985224001, 
CurrSamplesPerSec=11.921029084135139, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:45:53,971] [INFO] [timer.py:197:stop] 0/2332, RunningAvgSamplesPerSec=11.993389604425088, CurrSamplesPerSec=12.01116102919906, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:46:00,457] [INFO] [timer.py:197:stop] 0/2333, RunningAvgSamplesPerSec=11.993358741189898, CurrSamplesPerSec=11.921876191502752, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:46:06,859] [INFO] [timer.py:197:stop] 0/2334, RunningAvgSamplesPerSec=11.993328167450253, CurrSamplesPerSec=11.922481947087428, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:46:13,345] [INFO] [timer.py:197:stop] 0/2335, RunningAvgSamplesPerSec=11.993299087959208, CurrSamplesPerSec=11.925867156812524, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:46:19,772] [INFO] [timer.py:197:stop] 0/2336, RunningAvgSamplesPerSec=11.993306616101107, CurrSamplesPerSec=12.010895539637902, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:46:26,302] [INFO] [timer.py:197:stop] 0/2337, RunningAvgSamplesPerSec=11.993278619585297, CurrSamplesPerSec=11.928288991199116, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:46:32,836] [INFO] [timer.py:197:stop] 0/2338, RunningAvgSamplesPerSec=11.993253805544093, CurrSamplesPerSec=11.935591710839345, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:46:39,247] [INFO] [timer.py:197:stop] 0/2339, RunningAvgSamplesPerSec=11.99326267319586, CurrSamplesPerSec=12.014013363779732, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:46:45,736] [INFO] [logging.py:68:log_dist] [Rank 0] step=2340, skipped=4, lr=[5.922222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 22:46:45,736] [INFO] [timer.py:197:stop] 0/2340, RunningAvgSamplesPerSec=11.993228044873415, CurrSamplesPerSec=11.912844290755718, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:46:52,235] [INFO] [timer.py:197:stop] 0/2341, 
RunningAvgSamplesPerSec=11.993184123232426, CurrSamplesPerSec=11.89136748035138, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:46:58,725] [INFO] [timer.py:197:stop] 0/2342, RunningAvgSamplesPerSec=11.993180174163426, CurrSamplesPerSec=11.983950413355965, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:47:05,193] [INFO] [timer.py:197:stop] 0/2343, RunningAvgSamplesPerSec=11.993139752186806, CurrSamplesPerSec=11.89929279484827, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:47:11,584] [INFO] [timer.py:197:stop] 0/2344, RunningAvgSamplesPerSec=11.993127999906472, CurrSamplesPerSec=11.96567890642078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:47:18,071] [INFO] [timer.py:197:stop] 0/2345, RunningAvgSamplesPerSec=11.993095047632659, CurrSamplesPerSec=11.916414464030389, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:47:24,616] [INFO] [timer.py:197:stop] 0/2346, RunningAvgSamplesPerSec=11.99301727115946, CurrSamplesPerSec=11.813515631762911, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:47:31,142] [INFO] [timer.py:197:stop] 0/2347, RunningAvgSamplesPerSec=11.993011696753022, CurrSamplesPerSec=11.97995951451334, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:47:37,594] [INFO] [timer.py:197:stop] 0/2348, RunningAvgSamplesPerSec=11.992998518100826, CurrSamplesPerSec=11.96217404206181, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:47:44,064] [INFO] [timer.py:197:stop] 0/2349, RunningAvgSamplesPerSec=11.992996322022952, CurrSamplesPerSec=11.987846536532647, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:47:50,541] [INFO] [logging.py:68:log_dist] [Rank 0] step=2350, skipped=4, lr=[5.9e-06], mom=[[0.9, 0.999]] [2022-12-19 22:47:50,542] [INFO] [timer.py:197:stop] 0/2350, RunningAvgSamplesPerSec=11.992929048251844, CurrSamplesPerSec=11.837090062051407, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.9e-06, 'epoch': 61.84} 
[2022-12-19 22:47:56,996] [INFO] [timer.py:197:stop] 0/2351, RunningAvgSamplesPerSec=11.992932646621572, CurrSamplesPerSec=12.001387577745552, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:48:03,548] [INFO] [timer.py:197:stop] 0/2352, RunningAvgSamplesPerSec=11.992909189720955, CurrSamplesPerSec=11.93806103199089, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:48:10,048] [INFO] [timer.py:197:stop] 0/2353, RunningAvgSamplesPerSec=11.992868689603181, CurrSamplesPerSec=11.898443092241578, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:48:16,561] [INFO] [timer.py:197:stop] 0/2354, RunningAvgSamplesPerSec=11.992806187380308, CurrSamplesPerSec=11.847642842338413, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:48:23,070] [INFO] [timer.py:197:stop] 0/2355, RunningAvgSamplesPerSec=11.992748652840493, CurrSamplesPerSec=11.858937922330506, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:48:27,730] [INFO] [timer.py:197:stop] 0/2356, RunningAvgSamplesPerSec=11.99413735435258, CurrSamplesPerSec=16.486013244307518, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:48:34,252] [INFO] [timer.py:197:stop] 0/2357, RunningAvgSamplesPerSec=11.994112470169412, CurrSamplesPerSec=11.935819915646597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:48:40,807] [INFO] [timer.py:197:stop] 0/2358, RunningAvgSamplesPerSec=11.994039093150883, CurrSamplesPerSec=11.823691521696185, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:48:47,269] [INFO] [timer.py:197:stop] 0/2359, RunningAvgSamplesPerSec=11.99404161781061, CurrSamplesPerSec=11.99999266863317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:48:53,690] [INFO] [logging.py:68:log_dist] [Rank 0] step=2360, skipped=4, lr=[5.877777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 22:48:53,691] [INFO] [timer.py:197:stop] 0/2360, RunningAvgSamplesPerSec=11.994002528856411, CurrSamplesPerSec=11.90257248724611, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 22:49:00,178] [INFO] [timer.py:197:stop] 0/2361, RunningAvgSamplesPerSec=11.993955447490563, CurrSamplesPerSec=11.883956188348222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:49:06,676] [INFO] [timer.py:197:stop] 0/2362, RunningAvgSamplesPerSec=11.993952600826569, CurrSamplesPerSec=11.987241079761292, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:49:13,162] [INFO] [timer.py:197:stop] 0/2363, RunningAvgSamplesPerSec=11.993910030803447, CurrSamplesPerSec=11.894279669184193, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:49:19,713] [INFO] [timer.py:197:stop] 0/2364, RunningAvgSamplesPerSec=11.993832805260842, CurrSamplesPerSec=11.814234694950892, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:49:26,164] [INFO] [timer.py:197:stop] 0/2365, RunningAvgSamplesPerSec=11.993820723975158, CurrSamplesPerSec=11.965352488206056, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:49:32,656] [INFO] [timer.py:197:stop] 0/2366, RunningAvgSamplesPerSec=11.993767228496848, CurrSamplesPerSec=11.868676381867111, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:49:39,178] [INFO] [timer.py:197:stop] 0/2367, RunningAvgSamplesPerSec=11.993726493263713, CurrSamplesPerSec=11.898195746628854, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:49:45,672] [INFO] [timer.py:197:stop] 0/2368, RunningAvgSamplesPerSec=11.993683038612327, CurrSamplesPerSec=11.891786279843046, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:49:52,219] [INFO] [timer.py:197:stop] 0/2369, RunningAvgSamplesPerSec=11.993626241381032, CurrSamplesPerSec=11.860733613196125, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:49:58,743] [INFO] [logging.py:68:log_dist] [Rank 0] step=2370, skipped=4, lr=[5.855555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 22:49:58,744] [INFO] [timer.py:197:stop] 0/2370, RunningAvgSamplesPerSec=11.993565454194911, CurrSamplesPerSec=11.851388559543176, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:50:05,235] [INFO] [timer.py:197:stop] 0/2371, RunningAvgSamplesPerSec=11.993524268749061, CurrSamplesPerSec=11.896784122142385, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:50:11,691] [INFO] [timer.py:197:stop] 0/2372, RunningAvgSamplesPerSec=11.993482136837116, CurrSamplesPerSec=11.894495755945133, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:50:18,234] [INFO] [timer.py:197:stop] 0/2373, RunningAvgSamplesPerSec=11.993437356285387, CurrSamplesPerSec=11.888238743825376, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:50:24,706] [INFO] [timer.py:197:stop] 0/2374, RunningAvgSamplesPerSec=11.993400492560257, CurrSamplesPerSec=11.906629226949322, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:50:31,195] [INFO] [timer.py:197:stop] 0/2375, RunningAvgSamplesPerSec=11.993357761282489, CurrSamplesPerSec=11.892848951305792, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.844444444444445e-06, 'epoch': 62.5} [2022-12-19 22:50:37,638] [INFO] [timer.py:197:stop] 0/2376, RunningAvgSamplesPerSec=11.993349708267619, CurrSamplesPerSec=11.974270317260192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:50:44,112] [INFO] [timer.py:197:stop] 0/2377, RunningAvgSamplesPerSec=11.993293989930796, CurrSamplesPerSec=11.862462223777356, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:50:50,665] [INFO] [timer.py:197:stop] 0/2378, RunningAvgSamplesPerSec=11.993208920642992, CurrSamplesPerSec=11.794517951991669, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:50:57,195] [INFO] [timer.py:197:stop] 0/2379, RunningAvgSamplesPerSec=11.993162826851984, CurrSamplesPerSec=11.884635444014911, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:51:03,717] [INFO] [logging.py:68:log_dist] [Rank 0] step=2380, skipped=4, lr=[5.833333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 22:51:03,717] [INFO] [timer.py:197:stop] 
0/2380, RunningAvgSamplesPerSec=11.993130336630903, CurrSamplesPerSec=11.9163954202516, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:51:10,210] [INFO] [timer.py:197:stop] 0/2381, RunningAvgSamplesPerSec=11.993085224031413, CurrSamplesPerSec=11.886758947512499, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:51:16,644] [INFO] [timer.py:197:stop] 0/2382, RunningAvgSamplesPerSec=11.993084398872336, CurrSamplesPerSec=11.991121666828075, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:51:23,081] [INFO] [timer.py:197:stop] 0/2383, RunningAvgSamplesPerSec=11.993080022578921, CurrSamplesPerSec=11.982673485770245, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:51:29,550] [INFO] [timer.py:197:stop] 0/2384, RunningAvgSamplesPerSec=11.993072204906364, CurrSamplesPerSec=11.974487183593444, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:51:35,990] [INFO] [timer.py:197:stop] 0/2385, RunningAvgSamplesPerSec=11.99307998505076, CurrSamplesPerSec=12.011640982422453, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:51:42,478] [INFO] [timer.py:197:stop] 0/2386, RunningAvgSamplesPerSec=11.993051167451341, CurrSamplesPerSec=11.924769970962755, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:51:48,971] [INFO] [timer.py:197:stop] 0/2387, RunningAvgSamplesPerSec=11.993043112937269, CurrSamplesPerSec=11.973871859203012, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:51:55,462] [INFO] [timer.py:197:stop] 0/2388, RunningAvgSamplesPerSec=11.992990038794979, CurrSamplesPerSec=11.867730831028885, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:52:02,402] [INFO] [timer.py:197:stop] 0/2389, RunningAvgSamplesPerSec=11.99296218937087, CurrSamplesPerSec=11.9268797560982, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:52:09,123] [INFO] [logging.py:68:log_dist] [Rank 0] step=2390, skipped=4, lr=[5.8111111111111116e-06], mom=[[0.9, 0.999]] [2022-12-19 22:52:09,124] [INFO] 
[timer.py:197:stop] 0/2390, RunningAvgSamplesPerSec=11.992896340736769, CurrSamplesPerSec=11.837749875717838, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:52:15,890] [INFO] [timer.py:197:stop] 0/2391, RunningAvgSamplesPerSec=11.992847561955974, CurrSamplesPerSec=11.87748479427011, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:52:22,338] [INFO] [timer.py:197:stop] 0/2392, RunningAvgSamplesPerSec=11.992820042707805, CurrSamplesPerSec=11.927435142584894, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:52:29,180] [INFO] [timer.py:197:stop] 0/2393, RunningAvgSamplesPerSec=11.992768982761165, CurrSamplesPerSec=11.87196546935016, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:52:33,880] [INFO] [timer.py:197:stop] 0/2394, RunningAvgSamplesPerSec=11.994176345561002, CurrSamplesPerSec=16.67215019349934, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:52:40,390] [INFO] [timer.py:197:stop] 0/2395, RunningAvgSamplesPerSec=11.994145067075177, CurrSamplesPerSec=11.919790935654596, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:52:46,950] [INFO] [timer.py:197:stop] 0/2396, RunningAvgSamplesPerSec=11.994098704497933, CurrSamplesPerSec=11.8841703217617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:52:53,669] [INFO] [timer.py:197:stop] 0/2397, RunningAvgSamplesPerSec=11.994091899298029, CurrSamplesPerSec=11.977822358980218, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:53:00,113] [INFO] [timer.py:197:stop] 0/2398, RunningAvgSamplesPerSec=11.994103605592056, CurrSamplesPerSec=12.022205897152148, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:53:06,616] [INFO] [timer.py:197:stop] 0/2399, RunningAvgSamplesPerSec=11.99404270689459, CurrSamplesPerSec=11.84988392046741, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:53:13,256] [INFO] [logging.py:68:log_dist] [Rank 0] step=2400, skipped=4, lr=[5.788888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 
22:53:13,257] [INFO] [timer.py:197:stop] 0/2400, RunningAvgSamplesPerSec=11.993810025987884, CurrSamplesPerSec=11.460867140880667, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.788888888888889e-06, 'epoch': 63.16} [2022-12-19 22:53:19,903] [INFO] [timer.py:197:stop] 0/2401, RunningAvgSamplesPerSec=11.993744756782625, CurrSamplesPerSec=11.839246213077313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:53:26,489] [INFO] [timer.py:197:stop] 0/2402, RunningAvgSamplesPerSec=11.993759542481468, CurrSamplesPerSec=12.029335692396918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:53:33,086] [INFO] [timer.py:197:stop] 0/2403, RunningAvgSamplesPerSec=11.993706075387792, CurrSamplesPerSec=11.86674398777268, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:53:39,587] [INFO] [timer.py:197:stop] 0/2404, RunningAvgSamplesPerSec=11.993666666355502, CurrSamplesPerSec=11.899786532550793, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:53:46,016] [INFO] [timer.py:197:stop] 0/2405, RunningAvgSamplesPerSec=11.993653475477755, CurrSamplesPerSec=11.962052504424461, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:53:52,619] [INFO] [timer.py:197:stop] 0/2406, RunningAvgSamplesPerSec=11.993602116977407, CurrSamplesPerSec=11.871445161963678, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:53:59,167] [INFO] [timer.py:197:stop] 0/2407, RunningAvgSamplesPerSec=11.993546289886632, CurrSamplesPerSec=11.860823752660435, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:54:05,654] [INFO] [timer.py:197:stop] 0/2408, RunningAvgSamplesPerSec=11.993528944545561, CurrSamplesPerSec=11.95195805007209, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:54:12,112] [INFO] [timer.py:197:stop] 0/2409, RunningAvgSamplesPerSec=11.993511832286186, CurrSamplesPerSec=11.95248064924566, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:54:18,748] [INFO] [logging.py:68:log_dist] 
[Rank 0] step=2410, skipped=4, lr=[5.766666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 22:54:18,748] [INFO] [timer.py:197:stop] 0/2410, RunningAvgSamplesPerSec=11.993475585322878, CurrSamplesPerSec=11.906859494461779, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:54:25,290] [INFO] [timer.py:197:stop] 0/2411, RunningAvgSamplesPerSec=11.993489281062184, CurrSamplesPerSec=12.026559594842704, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:54:31,849] [INFO] [timer.py:197:stop] 0/2412, RunningAvgSamplesPerSec=11.99350590225323, CurrSamplesPerSec=12.033680530588319, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:54:38,419] [INFO] [timer.py:197:stop] 0/2413, RunningAvgSamplesPerSec=11.993520620736312, CurrSamplesPerSec=12.029097429137966, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:54:44,927] [INFO] [timer.py:197:stop] 0/2414, RunningAvgSamplesPerSec=11.993487593358239, CurrSamplesPerSec=11.914383999409154, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:54:51,405] [INFO] [timer.py:197:stop] 0/2415, RunningAvgSamplesPerSec=11.993435971821228, CurrSamplesPerSec=11.87020469374517, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:54:57,853] [INFO] [timer.py:197:stop] 0/2416, RunningAvgSamplesPerSec=11.993428702324346, CurrSamplesPerSec=11.975913034947336, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:55:04,384] [INFO] [timer.py:197:stop] 0/2417, RunningAvgSamplesPerSec=11.993402241786393, CurrSamplesPerSec=11.92986503656194, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:55:11,011] [INFO] [timer.py:197:stop] 0/2418, RunningAvgSamplesPerSec=11.99341178360107, CurrSamplesPerSec=12.016499644163416, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:55:17,525] [INFO] [timer.py:197:stop] 0/2419, RunningAvgSamplesPerSec=11.993423358081074, CurrSamplesPerSec=12.021452682171933, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:55:23,991] [INFO] 
[logging.py:68:log_dist] [Rank 0] step=2420, skipped=4, lr=[5.744444444444444e-06], mom=[[0.9, 0.999]] [2022-12-19 22:55:23,992] [INFO] [timer.py:197:stop] 0/2420, RunningAvgSamplesPerSec=11.993395650152591, CurrSamplesPerSec=11.926797618403373, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:55:30,520] [INFO] [timer.py:197:stop] 0/2421, RunningAvgSamplesPerSec=11.99336122960454, CurrSamplesPerSec=11.91070617298193, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:55:37,023] [INFO] [timer.py:197:stop] 0/2422, RunningAvgSamplesPerSec=11.993318713670687, CurrSamplesPerSec=11.891347463001154, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:55:43,555] [INFO] [timer.py:197:stop] 0/2423, RunningAvgSamplesPerSec=11.99328783420295, CurrSamplesPerSec=11.919022449559534, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:55:50,182] [INFO] [timer.py:197:stop] 0/2424, RunningAvgSamplesPerSec=11.99326657284993, CurrSamplesPerSec=11.942012902966344, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:55:56,828] [INFO] [timer.py:197:stop] 0/2425, RunningAvgSamplesPerSec=11.993238663768896, CurrSamplesPerSec=11.926021870430313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.733333333333334e-06, 'epoch': 63.82} [2022-12-19 22:56:03,313] [INFO] [timer.py:197:stop] 0/2426, RunningAvgSamplesPerSec=11.993208814222255, CurrSamplesPerSec=11.921317087132918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:56:09,727] [INFO] [timer.py:197:stop] 0/2427, RunningAvgSamplesPerSec=11.993221732647374, CurrSamplesPerSec=12.024618004513922, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:56:16,218] [INFO] [timer.py:197:stop] 0/2428, RunningAvgSamplesPerSec=11.993171699799538, CurrSamplesPerSec=11.873057685663747, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:56:22,706] [INFO] [timer.py:197:stop] 0/2429, RunningAvgSamplesPerSec=11.993136031115217, 
CurrSamplesPerSec=11.907223926845717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:56:29,193] [INFO] [logging.py:68:log_dist] [Rank 0] step=2430, skipped=4, lr=[5.722222222222222e-06], mom=[[0.9, 0.999]] [2022-12-19 22:56:29,194] [INFO] [timer.py:197:stop] 0/2430, RunningAvgSamplesPerSec=11.993140337389068, CurrSamplesPerSec=12.003600783457932, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:56:35,763] [INFO] [timer.py:197:stop] 0/2431, RunningAvgSamplesPerSec=11.993111788087147, CurrSamplesPerSec=11.924192585810975, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:56:40,538] [INFO] [timer.py:197:stop] 0/2432, RunningAvgSamplesPerSec=11.9944830121654, CurrSamplesPerSec=16.606371723505045, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:56:47,029] [INFO] [timer.py:197:stop] 0/2433, RunningAvgSamplesPerSec=11.994462310048664, CurrSamplesPerSec=11.944366360973103, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:56:53,597] [INFO] [timer.py:197:stop] 0/2434, RunningAvgSamplesPerSec=11.994433819893915, CurrSamplesPerSec=11.925572046419534, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:57:00,223] [INFO] [timer.py:197:stop] 0/2435, RunningAvgSamplesPerSec=11.994438903603921, CurrSamplesPerSec=12.006815248824033, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:57:06,799] [INFO] [timer.py:197:stop] 0/2436, RunningAvgSamplesPerSec=11.99444331287927, CurrSamplesPerSec=12.005180687199552, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:57:13,260] [INFO] [timer.py:197:stop] 0/2437, RunningAvgSamplesPerSec=11.994459665840974, CurrSamplesPerSec=12.034395353974078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:57:19,798] [INFO] [timer.py:197:stop] 0/2438, RunningAvgSamplesPerSec=11.994389102825249, CurrSamplesPerSec=11.824995734709457, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:57:26,323] [INFO] [timer.py:197:stop] 0/2439, 
RunningAvgSamplesPerSec=11.994340617623934, CurrSamplesPerSec=11.877382839608016, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:57:32,883] [INFO] [logging.py:68:log_dist] [Rank 0] step=2440, skipped=4, lr=[5.7e-06], mom=[[0.9, 0.999]] [2022-12-19 22:57:32,883] [INFO] [timer.py:197:stop] 0/2440, RunningAvgSamplesPerSec=11.994286672587622, CurrSamplesPerSec=11.864248494712356, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:57:39,513] [INFO] [timer.py:197:stop] 0/2441, RunningAvgSamplesPerSec=11.99425687428106, CurrSamplesPerSec=11.922046156918173, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:57:46,167] [INFO] [timer.py:197:stop] 0/2442, RunningAvgSamplesPerSec=11.994226560667899, CurrSamplesPerSec=11.920744800706055, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:57:52,719] [INFO] [timer.py:197:stop] 0/2443, RunningAvgSamplesPerSec=11.994201406237991, CurrSamplesPerSec=11.933137203483037, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:57:59,149] [INFO] [timer.py:197:stop] 0/2444, RunningAvgSamplesPerSec=11.994208512807482, CurrSamplesPerSec=12.011580784617756, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:58:05,606] [INFO] [timer.py:197:stop] 0/2445, RunningAvgSamplesPerSec=11.994208057757007, CurrSamplesPerSec=11.993096927478646, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:58:12,316] [INFO] [timer.py:197:stop] 0/2446, RunningAvgSamplesPerSec=11.994197029771689, CurrSamplesPerSec=11.967316066384866, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:58:18,790] [INFO] [timer.py:197:stop] 0/2447, RunningAvgSamplesPerSec=11.994168249583785, CurrSamplesPerSec=11.924239728006881, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:58:25,327] [INFO] [timer.py:197:stop] 0/2448, RunningAvgSamplesPerSec=11.994131293264166, CurrSamplesPerSec=11.9044489928255, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:58:32,038] [INFO] [timer.py:197:stop] 0/2449, 
RunningAvgSamplesPerSec=11.99410767303778, CurrSamplesPerSec=11.936609677773228, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:58:38,478] [INFO] [logging.py:68:log_dist] [Rank 0] step=2450, skipped=4, lr=[5.677777777777779e-06], mom=[[0.9, 0.999]] [2022-12-19 22:58:38,478] [INFO] [timer.py:197:stop] 0/2450, RunningAvgSamplesPerSec=11.994050234469794, CurrSamplesPerSec=11.855126698180474, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.677777777777779e-06, 'epoch': 64.47} [2022-12-19 22:58:45,012] [INFO] [timer.py:197:stop] 0/2451, RunningAvgSamplesPerSec=11.993980077064222, CurrSamplesPerSec=11.824660278640124, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:58:51,652] [INFO] [timer.py:197:stop] 0/2452, RunningAvgSamplesPerSec=11.993945861987871, CurrSamplesPerSec=11.910734711403169, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:58:58,176] [INFO] [timer.py:197:stop] 0/2453, RunningAvgSamplesPerSec=11.993893283555511, CurrSamplesPerSec=11.866445501330224, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:59:04,562] [INFO] [timer.py:197:stop] 0/2454, RunningAvgSamplesPerSec=11.993897004640285, CurrSamplesPerSec=12.003024326836206, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:59:11,027] [INFO] [timer.py:197:stop] 0/2455, RunningAvgSamplesPerSec=11.99387897205873, CurrSamplesPerSec=11.949825552920274, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:59:17,613] [INFO] [timer.py:197:stop] 0/2456, RunningAvgSamplesPerSec=11.9938514784928, CurrSamplesPerSec=11.926787020073544, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:59:24,112] [INFO] [timer.py:197:stop] 0/2457, RunningAvgSamplesPerSec=11.993810475321688, CurrSamplesPerSec=11.894026172528951, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:59:30,503] [INFO] [timer.py:197:stop] 0/2458, RunningAvgSamplesPerSec=11.993812237338794, CurrSamplesPerSec=11.998139550683433, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 22:59:37,157] [INFO] [timer.py:197:stop] 0/2459, RunningAvgSamplesPerSec=11.993810565137206, CurrSamplesPerSec=11.98970504442193, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:59:43,877] [INFO] [logging.py:68:log_dist] [Rank 0] step=2460, skipped=4, lr=[5.6555555555555566e-06], mom=[[0.9, 0.999]] [2022-12-19 22:59:43,878] [INFO] [timer.py:197:stop] 0/2460, RunningAvgSamplesPerSec=11.993753254162998, CurrSamplesPerSec=11.854574883924197, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:59:50,338] [INFO] [timer.py:197:stop] 0/2461, RunningAvgSamplesPerSec=11.993736521542084, CurrSamplesPerSec=11.952748352673314, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 22:59:56,845] [INFO] [timer.py:197:stop] 0/2462, RunningAvgSamplesPerSec=11.993697706256942, CurrSamplesPerSec=11.899004800108939, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:00:03,322] [INFO] [timer.py:197:stop] 0/2463, RunningAvgSamplesPerSec=11.993697125268392, CurrSamplesPerSec=11.992268063804735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:00:09,753] [INFO] [timer.py:197:stop] 0/2464, RunningAvgSamplesPerSec=11.99370013679229, CurrSamplesPerSec=12.001116081557646, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:00:16,271] [INFO] [timer.py:197:stop] 0/2465, RunningAvgSamplesPerSec=11.993657058717556, CurrSamplesPerSec=11.888528850255192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:00:22,730] [INFO] [timer.py:197:stop] 0/2466, RunningAvgSamplesPerSec=11.99364424486012, CurrSamplesPerSec=11.962166579153259, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:00:29,473] [INFO] [timer.py:197:stop] 0/2467, RunningAvgSamplesPerSec=11.99348397176344, CurrSamplesPerSec=11.611164901056874, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:00:35,950] [INFO] [timer.py:197:stop] 0/2468, RunningAvgSamplesPerSec=11.99346737854851, CurrSamplesPerSec=11.952704178120571, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:00:42,438] [INFO] [timer.py:197:stop] 0/2469, RunningAvgSamplesPerSec=11.99344159131794, CurrSamplesPerSec=11.9301858098784, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:00:47,072] [INFO] [logging.py:68:log_dist] [Rank 0] step=2470, skipped=4, lr=[5.633333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 23:00:47,073] [INFO] [timer.py:197:stop] 0/2470, RunningAvgSamplesPerSec=11.994793116435163, CurrSamplesPerSec=16.61335536106964, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:00:53,541] [INFO] [timer.py:197:stop] 0/2471, RunningAvgSamplesPerSec=11.994764695184484, CurrSamplesPerSec=11.925029017884114, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:01:00,023] [INFO] [timer.py:197:stop] 0/2472, RunningAvgSamplesPerSec=11.994730495574764, CurrSamplesPerSec=11.910882162091552, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:01:06,501] [INFO] [timer.py:197:stop] 0/2473, RunningAvgSamplesPerSec=11.994689074941201, CurrSamplesPerSec=11.893245723649175, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:01:13,084] [INFO] [timer.py:197:stop] 0/2474, RunningAvgSamplesPerSec=11.994685724714028, CurrSamplesPerSec=11.986413025277363, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:01:19,564] [INFO] [timer.py:197:stop] 0/2475, RunningAvgSamplesPerSec=11.99468957298046, CurrSamplesPerSec=12.004210041277581, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.622222222222222e-06, 'epoch': 65.13} [2022-12-19 23:01:26,005] [INFO] [timer.py:197:stop] 0/2476, RunningAvgSamplesPerSec=11.994693436336153, CurrSamplesPerSec=12.004255134182154, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:01:32,481] [INFO] [timer.py:197:stop] 0/2477, RunningAvgSamplesPerSec=11.99464369136898, CurrSamplesPerSec=11.872825047750405, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:01:38,948] [INFO] [timer.py:197:stop] 
0/2478, RunningAvgSamplesPerSec=11.994599434467226, CurrSamplesPerSec=11.886055239123017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:01:45,600] [INFO] [timer.py:197:stop] 0/2479, RunningAvgSamplesPerSec=11.994564680370615, CurrSamplesPerSec=11.909126731774899, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:01:52,128] [INFO] [logging.py:68:log_dist] [Rank 0] step=2480, skipped=4, lr=[5.611111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 23:01:52,128] [INFO] [timer.py:197:stop] 0/2480, RunningAvgSamplesPerSec=11.994533082431731, CurrSamplesPerSec=11.916772604206047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:01:58,656] [INFO] [timer.py:197:stop] 0/2481, RunningAvgSamplesPerSec=11.994436200670176, CurrSamplesPerSec=11.759075918289247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:02:05,326] [INFO] [timer.py:197:stop] 0/2482, RunningAvgSamplesPerSec=11.99441302691193, CurrSamplesPerSec=11.93723922700843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:02:11,773] [INFO] [timer.py:197:stop] 0/2483, RunningAvgSamplesPerSec=11.99440908415221, CurrSamplesPerSec=11.984639007981546, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:02:18,299] [INFO] [timer.py:197:stop] 0/2484, RunningAvgSamplesPerSec=11.994378815468242, CurrSamplesPerSec=11.919749650824997, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:02:24,721] [INFO] [timer.py:197:stop] 0/2485, RunningAvgSamplesPerSec=11.994379503553883, CurrSamplesPerSec=11.996087575424264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:02:31,376] [INFO] [timer.py:197:stop] 0/2486, RunningAvgSamplesPerSec=11.994351487107897, CurrSamplesPerSec=11.925187947946233, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:02:37,876] [INFO] [timer.py:197:stop] 0/2487, RunningAvgSamplesPerSec=11.994322055699913, CurrSamplesPerSec=11.921657520386495, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:02:44,311] [INFO] 
[timer.py:197:stop] 0/2488, RunningAvgSamplesPerSec=11.994295028332129, CurrSamplesPerSec=11.927506159472177, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:02:50,916] [INFO] [timer.py:197:stop] 0/2489, RunningAvgSamplesPerSec=11.994257895982496, CurrSamplesPerSec=11.902652180515972, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:02:57,336] [INFO] [logging.py:68:log_dist] [Rank 0] step=2490, skipped=4, lr=[5.588888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 23:02:57,337] [INFO] [timer.py:197:stop] 0/2490, RunningAvgSamplesPerSec=11.994227077265899, CurrSamplesPerSec=11.918067801829599, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:03:03,783] [INFO] [timer.py:197:stop] 0/2491, RunningAvgSamplesPerSec=11.9942187606295, CurrSamplesPerSec=11.973562618479235, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:03:10,476] [INFO] [timer.py:197:stop] 0/2492, RunningAvgSamplesPerSec=11.99419212231677, CurrSamplesPerSec=11.928254008034955, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:03:17,024] [INFO] [timer.py:197:stop] 0/2493, RunningAvgSamplesPerSec=11.994170868802827, CurrSamplesPerSec=11.94148218794398, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:03:23,633] [INFO] [timer.py:197:stop] 0/2494, RunningAvgSamplesPerSec=11.994173049455018, CurrSamplesPerSec=11.999607516251531, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:03:30,140] [INFO] [timer.py:197:stop] 0/2495, RunningAvgSamplesPerSec=11.994150906229311, CurrSamplesPerSec=11.939222794320901, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:03:36,702] [INFO] [timer.py:197:stop] 0/2496, RunningAvgSamplesPerSec=11.994167305963822, CurrSamplesPerSec=12.035191740250305, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:03:43,201] [INFO] [timer.py:197:stop] 0/2497, RunningAvgSamplesPerSec=11.99413811470603, CurrSamplesPerSec=11.921774532262061, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
23:03:49,702] [INFO] [timer.py:197:stop] 0/2498, RunningAvgSamplesPerSec=11.994113853097593, CurrSamplesPerSec=11.933885227814852, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:03:56,253] [INFO] [timer.py:197:stop] 0/2499, RunningAvgSamplesPerSec=11.994061533038249, CurrSamplesPerSec=11.864877776076565, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:04:02,862] [INFO] [logging.py:68:log_dist] [Rank 0] step=2500, skipped=4, lr=[5.566666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 23:04:02,862] [INFO] [timer.py:197:stop] 0/2500, RunningAvgSamplesPerSec=11.994075425102839, CurrSamplesPerSec=12.0288645655709, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.566666666666667e-06, 'epoch': 65.79} [2022-12-19 23:04:09,371] [INFO] [timer.py:197:stop] 0/2501, RunningAvgSamplesPerSec=11.99407243844835, CurrSamplesPerSec=11.986616415262288, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:04:15,758] [INFO] [timer.py:197:stop] 0/2502, RunningAvgSamplesPerSec=11.994069880127311, CurrSamplesPerSec=11.98768004320711, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:04:22,232] [INFO] [timer.py:197:stop] 0/2503, RunningAvgSamplesPerSec=11.994045919144213, CurrSamplesPerSec=11.934441266941256, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:04:28,690] [INFO] [timer.py:197:stop] 0/2504, RunningAvgSamplesPerSec=11.99402215711698, CurrSamplesPerSec=11.934886453932444, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:04:35,157] [INFO] [timer.py:197:stop] 0/2505, RunningAvgSamplesPerSec=11.994032933383286, CurrSamplesPerSec=12.021055922814593, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:04:41,630] [INFO] [timer.py:197:stop] 0/2506, RunningAvgSamplesPerSec=11.994059228543025, CurrSamplesPerSec=12.060239318303696, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:04:48,197] [INFO] [timer.py:197:stop] 0/2507, RunningAvgSamplesPerSec=11.994021572685252, 
CurrSamplesPerSec=11.900467071536477, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:04:52,805] [INFO] [timer.py:197:stop] 0/2508, RunningAvgSamplesPerSec=11.99535668581381, CurrSamplesPerSec=16.63350666082712, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:04:59,287] [INFO] [timer.py:197:stop] 0/2509, RunningAvgSamplesPerSec=11.995321981967127, CurrSamplesPerSec=11.908980381215382, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:05:05,718] [INFO] [logging.py:68:log_dist] [Rank 0] step=2510, skipped=4, lr=[5.544444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 23:05:05,719] [INFO] [timer.py:197:stop] 0/2510, RunningAvgSamplesPerSec=11.995269294193678, CurrSamplesPerSec=11.864620287049643, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:05:12,157] [INFO] [timer.py:197:stop] 0/2511, RunningAvgSamplesPerSec=11.995257774708485, CurrSamplesPerSec=11.966436350622699, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:05:18,578] [INFO] [timer.py:197:stop] 0/2512, RunningAvgSamplesPerSec=11.995273964815027, CurrSamplesPerSec=12.036033024234069, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:05:25,097] [INFO] [timer.py:197:stop] 0/2513, RunningAvgSamplesPerSec=11.995245106032492, CurrSamplesPerSec=11.923244524149265, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:05:31,647] [INFO] [timer.py:197:stop] 0/2514, RunningAvgSamplesPerSec=11.99520790027067, CurrSamplesPerSec=11.902506516803575, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:05:38,162] [INFO] [timer.py:197:stop] 0/2515, RunningAvgSamplesPerSec=11.995185588251106, CurrSamplesPerSec=11.939398565077406, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:05:44,569] [INFO] [timer.py:197:stop] 0/2516, RunningAvgSamplesPerSec=11.995186585575626, CurrSamplesPerSec=11.99769338607688, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:05:51,074] [INFO] [timer.py:197:stop] 0/2517, 
RunningAvgSamplesPerSec=11.995170690625075, CurrSamplesPerSec=11.955343515292133, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:05:57,665] [INFO] [timer.py:197:stop] 0/2518, RunningAvgSamplesPerSec=11.99514700448771, CurrSamplesPerSec=11.935870864888182, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:06:04,049] [INFO] [timer.py:197:stop] 0/2519, RunningAvgSamplesPerSec=11.995154639706694, CurrSamplesPerSec=12.014395677393962, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:06:10,549] [INFO] [logging.py:68:log_dist] [Rank 0] step=2520, skipped=4, lr=[5.522222222222222e-06], mom=[[0.9, 0.999]] [2022-12-19 23:06:10,550] [INFO] [timer.py:197:stop] 0/2520, RunningAvgSamplesPerSec=11.995164168319766, CurrSamplesPerSec=12.019195756047031, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:06:17,072] [INFO] [timer.py:197:stop] 0/2521, RunningAvgSamplesPerSec=11.995097542983876, CurrSamplesPerSec=11.82964980664779, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:06:23,580] [INFO] [timer.py:197:stop] 0/2522, RunningAvgSamplesPerSec=11.995044522023019, CurrSamplesPerSec=11.862956055276724, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:06:30,045] [INFO] [timer.py:197:stop] 0/2523, RunningAvgSamplesPerSec=11.995049312385436, CurrSamplesPerSec=12.007133191616585, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:06:36,755] [INFO] [timer.py:197:stop] 0/2524, RunningAvgSamplesPerSec=11.995020936347997, CurrSamplesPerSec=11.923909210802773, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:06:43,280] [INFO] [timer.py:197:stop] 0/2525, RunningAvgSamplesPerSec=11.994987489190054, CurrSamplesPerSec=11.911223056691487, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.511111111111112e-06, 'epoch': 66.45} [2022-12-19 23:06:49,819] [INFO] [timer.py:197:stop] 0/2526, RunningAvgSamplesPerSec=11.994950376884772, CurrSamplesPerSec=11.90204157786392, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:06:56,337] [INFO] [timer.py:197:stop] 0/2527, RunningAvgSamplesPerSec=11.99494980023888, CurrSamplesPerSec=11.993494522659192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:07:02,712] [INFO] [timer.py:197:stop] 0/2528, RunningAvgSamplesPerSec=11.994946561710302, CurrSamplesPerSec=11.986774850138218, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:07:09,143] [INFO] [timer.py:197:stop] 0/2529, RunningAvgSamplesPerSec=11.9949484616637, CurrSamplesPerSec=11.999749665714324, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:07:15,641] [INFO] [logging.py:68:log_dist] [Rank 0] step=2530, skipped=4, lr=[5.500000000000001e-06], mom=[[0.9, 0.999]] [2022-12-19 23:07:15,642] [INFO] [timer.py:197:stop] 0/2530, RunningAvgSamplesPerSec=11.99489949827366, CurrSamplesPerSec=11.872432787310418, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:07:22,126] [INFO] [timer.py:197:stop] 0/2531, RunningAvgSamplesPerSec=11.994854987747, CurrSamplesPerSec=11.883378540133842, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:07:28,887] [INFO] [timer.py:197:stop] 0/2532, RunningAvgSamplesPerSec=11.994860488119782, CurrSamplesPerSec=12.00878708794644, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:07:35,404] [INFO] [timer.py:197:stop] 0/2533, RunningAvgSamplesPerSec=11.994823121979685, CurrSamplesPerSec=11.901026332375347, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:07:41,851] [INFO] [timer.py:197:stop] 0/2534, RunningAvgSamplesPerSec=11.994834958597341, CurrSamplesPerSec=12.024868479694296, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:07:48,409] [INFO] [timer.py:197:stop] 0/2535, RunningAvgSamplesPerSec=11.9947993389449, CurrSamplesPerSec=11.90528371324829, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:07:54,870] [INFO] [timer.py:197:stop] 0/2536, RunningAvgSamplesPerSec=11.994760345009702, 
CurrSamplesPerSec=11.896795721725674, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:08:01,475] [INFO] [timer.py:197:stop] 0/2537, RunningAvgSamplesPerSec=11.994762012770034, CurrSamplesPerSec=11.998989607538697, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:08:07,933] [INFO] [timer.py:197:stop] 0/2538, RunningAvgSamplesPerSec=11.994768711791302, CurrSamplesPerSec=12.011774817203726, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:08:14,368] [INFO] [timer.py:197:stop] 0/2539, RunningAvgSamplesPerSec=11.994777252613607, CurrSamplesPerSec=12.016475975818022, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:08:20,825] [INFO] [logging.py:68:log_dist] [Rank 0] step=2540, skipped=4, lr=[5.477777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 23:08:20,826] [INFO] [timer.py:197:stop] 0/2540, RunningAvgSamplesPerSec=11.994765703922816, CurrSamplesPerSec=11.965538096379218, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:08:27,253] [INFO] [timer.py:197:stop] 0/2541, RunningAvgSamplesPerSec=11.994771938876797, CurrSamplesPerSec=12.010617164403854, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:08:33,759] [INFO] [timer.py:197:stop] 0/2542, RunningAvgSamplesPerSec=11.994735348262523, CurrSamplesPerSec=11.90254609898134, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:08:40,401] [INFO] [timer.py:197:stop] 0/2543, RunningAvgSamplesPerSec=11.994615677413089, CurrSamplesPerSec=11.69816718639064, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:08:46,770] [INFO] [timer.py:197:stop] 0/2544, RunningAvgSamplesPerSec=11.994617179229403, CurrSamplesPerSec=11.998434509447742, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:08:53,225] [INFO] [timer.py:197:stop] 0/2545, RunningAvgSamplesPerSec=11.994589732151253, CurrSamplesPerSec=11.925222913128406, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:08:57,932] [INFO] [timer.py:197:stop] 0/2546, 
RunningAvgSamplesPerSec=11.995881474326076, CurrSamplesPerSec=16.520184642220926, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:09:04,428] [INFO] [timer.py:197:stop] 0/2547, RunningAvgSamplesPerSec=11.995883136777568, CurrSamplesPerSec=12.000113905557361, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:09:10,947] [INFO] [timer.py:197:stop] 0/2548, RunningAvgSamplesPerSec=11.995850090710658, CurrSamplesPerSec=11.912333609919765, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:09:17,482] [INFO] [timer.py:197:stop] 0/2549, RunningAvgSamplesPerSec=11.995804467615095, CurrSamplesPerSec=11.880762465396428, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:09:23,972] [INFO] [logging.py:68:log_dist] [Rank 0] step=2550, skipped=4, lr=[5.455555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 23:09:23,972] [INFO] [timer.py:197:stop] 0/2550, RunningAvgSamplesPerSec=11.995761385108585, CurrSamplesPerSec=11.887025294188264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.455555555555556e-06, 'epoch': 67.11} [2022-12-19 23:09:30,562] [INFO] [timer.py:197:stop] 0/2551, RunningAvgSamplesPerSec=11.99571578890994, CurrSamplesPerSec=11.880651515437187, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:09:36,990] [INFO] [timer.py:197:stop] 0/2552, RunningAvgSamplesPerSec=11.99571737281347, CurrSamplesPerSec=11.999756102750835, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:09:43,433] [INFO] [timer.py:197:stop] 0/2553, RunningAvgSamplesPerSec=11.995711537649248, CurrSamplesPerSec=11.980850310211263, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:09:49,907] [INFO] [timer.py:197:stop] 0/2554, RunningAvgSamplesPerSec=11.995689004009133, CurrSamplesPerSec=11.938479941244108, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:09:56,414] [INFO] [timer.py:197:stop] 0/2555, RunningAvgSamplesPerSec=11.995653460717007, CurrSamplesPerSec=11.905627983720006, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:10:02,886] [INFO] [timer.py:197:stop] 0/2556, RunningAvgSamplesPerSec=11.995637700856951, CurrSamplesPerSec=11.955537332647925, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:10:09,532] [INFO] [timer.py:197:stop] 0/2557, RunningAvgSamplesPerSec=11.995566677072516, CurrSamplesPerSec=11.81687513022048, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:10:16,097] [INFO] [timer.py:197:stop] 0/2558, RunningAvgSamplesPerSec=11.995530535361876, CurrSamplesPerSec=11.903894161219345, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:10:22,550] [INFO] [timer.py:197:stop] 0/2559, RunningAvgSamplesPerSec=11.995533721244904, CurrSamplesPerSec=12.00368237210802, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:10:28,999] [INFO] [logging.py:68:log_dist] [Rank 0] step=2560, skipped=4, lr=[5.4333333333333335e-06], mom=[[0.9, 0.999]] [2022-12-19 23:10:28,999] [INFO] [timer.py:197:stop] 0/2560, RunningAvgSamplesPerSec=11.995537066415517, CurrSamplesPerSec=12.004096773690103, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:10:35,478] [INFO] [timer.py:197:stop] 0/2561, RunningAvgSamplesPerSec=11.995493580476312, CurrSamplesPerSec=11.885278993317007, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:10:41,976] [INFO] [timer.py:197:stop] 0/2562, RunningAvgSamplesPerSec=11.995485420139472, CurrSamplesPerSec=11.974639422042122, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:10:48,407] [INFO] [timer.py:197:stop] 0/2563, RunningAvgSamplesPerSec=11.995487085037128, CurrSamplesPerSec=11.999750738553262, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:10:54,803] [INFO] [timer.py:197:stop] 0/2564, RunningAvgSamplesPerSec=11.995482749902797, CurrSamplesPerSec=11.98439074096835, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:11:01,293] [INFO] [timer.py:197:stop] 0/2565, RunningAvgSamplesPerSec=11.995452969702262, 
CurrSamplesPerSec=11.919638500782135, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:11:07,761] [INFO] [timer.py:197:stop] 0/2566, RunningAvgSamplesPerSec=11.995448279149231, CurrSamplesPerSec=11.983438432748287, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:11:14,184] [INFO] [timer.py:197:stop] 0/2567, RunningAvgSamplesPerSec=11.995445790103584, CurrSamplesPerSec=11.989067271936335, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:11:20,622] [INFO] [timer.py:197:stop] 0/2568, RunningAvgSamplesPerSec=11.995405873035489, CurrSamplesPerSec=11.893885462833017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:11:27,062] [INFO] [timer.py:197:stop] 0/2569, RunningAvgSamplesPerSec=11.995371628467163, CurrSamplesPerSec=11.908139331492428, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:11:33,567] [INFO] [logging.py:68:log_dist] [Rank 0] step=2570, skipped=4, lr=[5.411111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 23:11:33,567] [INFO] [timer.py:197:stop] 0/2570, RunningAvgSamplesPerSec=11.995370892591726, CurrSamplesPerSec=11.993482197884369, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:11:40,134] [INFO] [timer.py:197:stop] 0/2571, RunningAvgSamplesPerSec=11.995336243863276, CurrSamplesPerSec=11.907013715298678, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:11:46,635] [INFO] [timer.py:197:stop] 0/2572, RunningAvgSamplesPerSec=11.995326934678943, CurrSamplesPerSec=11.97145924408169, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:11:53,126] [INFO] [timer.py:197:stop] 0/2573, RunningAvgSamplesPerSec=11.995298273398555, CurrSamplesPerSec=11.92208851665461, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:11:59,609] [INFO] [timer.py:197:stop] 0/2574, RunningAvgSamplesPerSec=11.99529269694805, CurrSamplesPerSec=11.980972764874071, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:12:06,086] [INFO] [timer.py:197:stop] 0/2575, 
RunningAvgSamplesPerSec=11.99525143468474, CurrSamplesPerSec=11.890055959597222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.400000000000001e-06, 'epoch': 67.76} [2022-12-19 23:12:12,509] [INFO] [timer.py:197:stop] 0/2576, RunningAvgSamplesPerSec=11.9952458772673, CurrSamplesPerSec=11.980963674267237, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:12:19,060] [INFO] [timer.py:197:stop] 0/2577, RunningAvgSamplesPerSec=11.995221611922272, CurrSamplesPerSec=11.933086277576447, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:12:25,571] [INFO] [timer.py:197:stop] 0/2578, RunningAvgSamplesPerSec=11.99517524754916, CurrSamplesPerSec=11.87696400299575, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:12:32,017] [INFO] [timer.py:197:stop] 0/2579, RunningAvgSamplesPerSec=11.995167891072645, CurrSamplesPerSec=11.976247510092694, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:12:38,564] [INFO] [logging.py:68:log_dist] [Rank 0] step=2580, skipped=4, lr=[5.388888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 23:12:38,565] [INFO] [timer.py:197:stop] 0/2580, RunningAvgSamplesPerSec=11.995130298699044, CurrSamplesPerSec=11.899031172673716, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:12:45,139] [INFO] [timer.py:197:stop] 0/2581, RunningAvgSamplesPerSec=11.995080117518452, CurrSamplesPerSec=11.867093902246719, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:12:51,590] [INFO] [timer.py:197:stop] 0/2582, RunningAvgSamplesPerSec=11.995092864910884, CurrSamplesPerSec=12.028058776280682, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:12:58,098] [INFO] [timer.py:197:stop] 0/2583, RunningAvgSamplesPerSec=11.995043275714673, CurrSamplesPerSec=11.868453885761943, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:13:02,724] [INFO] [timer.py:197:stop] 0/2584, RunningAvgSamplesPerSec=11.996334978708061, CurrSamplesPerSec=16.614009317841763, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 23:13:09,271] [INFO] [timer.py:197:stop] 0/2585, RunningAvgSamplesPerSec=11.996285995240514, CurrSamplesPerSec=11.871130689501747, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:13:15,732] [INFO] [timer.py:197:stop] 0/2586, RunningAvgSamplesPerSec=11.996249123465535, CurrSamplesPerSec=11.901759782384406, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:13:22,259] [INFO] [timer.py:197:stop] 0/2587, RunningAvgSamplesPerSec=11.996260804306424, CurrSamplesPerSec=12.02652026118116, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:13:28,865] [INFO] [timer.py:197:stop] 0/2588, RunningAvgSamplesPerSec=11.996220089268261, CurrSamplesPerSec=11.891887428293726, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:13:35,504] [INFO] [timer.py:197:stop] 0/2589, RunningAvgSamplesPerSec=11.99619896756129, CurrSamplesPerSec=11.941825898897413, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:13:42,018] [INFO] [logging.py:68:log_dist] [Rank 0] step=2590, skipped=4, lr=[5.366666666666666e-06], mom=[[0.9, 0.999]] [2022-12-19 23:13:42,018] [INFO] [timer.py:197:stop] 0/2590, RunningAvgSamplesPerSec=11.996209989815746, CurrSamplesPerSec=12.024792528053396, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:13:48,642] [INFO] [timer.py:197:stop] 0/2591, RunningAvgSamplesPerSec=11.996212776279876, CurrSamplesPerSec=12.003428484755691, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:13:55,113] [INFO] [timer.py:197:stop] 0/2592, RunningAvgSamplesPerSec=11.996189060166015, CurrSamplesPerSec=11.935100834724244, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:14:01,583] [INFO] [timer.py:197:stop] 0/2593, RunningAvgSamplesPerSec=11.996157250003128, CurrSamplesPerSec=11.914331118199586, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:14:08,053] [INFO] [timer.py:197:stop] 0/2594, RunningAvgSamplesPerSec=11.996149098095017, CurrSamplesPerSec=11.975064641863266, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:14:14,619] [INFO] [timer.py:197:stop] 0/2595, RunningAvgSamplesPerSec=11.996110385326878, CurrSamplesPerSec=11.8965995863597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:14:21,154] [INFO] [timer.py:197:stop] 0/2596, RunningAvgSamplesPerSec=11.996092761920874, CurrSamplesPerSec=11.95056875457464, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:14:27,633] [INFO] [timer.py:197:stop] 0/2597, RunningAvgSamplesPerSec=11.996051906505532, CurrSamplesPerSec=11.891001384206158, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:14:34,120] [INFO] [timer.py:197:stop] 0/2598, RunningAvgSamplesPerSec=11.996051596444469, CurrSamplesPerSec=11.995247041969106, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:14:40,606] [INFO] [timer.py:197:stop] 0/2599, RunningAvgSamplesPerSec=11.996049369955324, CurrSamplesPerSec=11.99027218878879, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:14:47,133] [INFO] [logging.py:68:log_dist] [Rank 0] step=2600, skipped=4, lr=[5.344444444444446e-06], mom=[[0.9, 0.999]] [2022-12-19 23:14:47,134] [INFO] [timer.py:197:stop] 0/2600, RunningAvgSamplesPerSec=11.995995746322059, CurrSamplesPerSec=11.858333891788265, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.344444444444446e-06, 'epoch': 68.42} [2022-12-19 23:14:53,698] [INFO] [timer.py:197:stop] 0/2601, RunningAvgSamplesPerSec=11.99596525877957, CurrSamplesPerSec=11.917278375003374, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:15:00,210] [INFO] [timer.py:197:stop] 0/2602, RunningAvgSamplesPerSec=11.995944214367379, CurrSamplesPerSec=11.941498124645033, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:15:06,705] [INFO] [timer.py:197:stop] 0/2603, RunningAvgSamplesPerSec=11.995909161651701, CurrSamplesPerSec=11.905459542151846, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:15:13,146] [INFO] [timer.py:197:stop] 
0/2604, RunningAvgSamplesPerSec=11.995895827094397, CurrSamplesPerSec=11.961312670681851, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:15:19,696] [INFO] [timer.py:197:stop] 0/2605, RunningAvgSamplesPerSec=11.995853545633036, CurrSamplesPerSec=11.886837376420232, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:15:26,214] [INFO] [timer.py:197:stop] 0/2606, RunningAvgSamplesPerSec=11.995780464914626, CurrSamplesPerSec=11.808522040176225, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:15:32,717] [INFO] [timer.py:197:stop] 0/2607, RunningAvgSamplesPerSec=11.995737768576547, CurrSamplesPerSec=11.885577901731386, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:15:39,189] [INFO] [timer.py:197:stop] 0/2608, RunningAvgSamplesPerSec=11.995704749200721, CurrSamplesPerSec=11.910301893372509, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:15:45,649] [INFO] [timer.py:197:stop] 0/2609, RunningAvgSamplesPerSec=11.995706909086158, CurrSamplesPerSec=12.001338213888623, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:15:52,028] [INFO] [logging.py:68:log_dist] [Rank 0] step=2610, skipped=4, lr=[5.322222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 23:15:52,029] [INFO] [timer.py:197:stop] 0/2610, RunningAvgSamplesPerSec=11.995715053576658, CurrSamplesPerSec=12.016985403753415, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:15:58,517] [INFO] [timer.py:197:stop] 0/2611, RunningAvgSamplesPerSec=11.995678114105877, CurrSamplesPerSec=11.900107800558725, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:16:05,021] [INFO] [timer.py:197:stop] 0/2612, RunningAvgSamplesPerSec=11.995642812557465, CurrSamplesPerSec=11.904243102559734, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:16:11,531] [INFO] [timer.py:197:stop] 0/2613, RunningAvgSamplesPerSec=11.995649635201463, CurrSamplesPerSec=12.013483219483618, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:16:17,967] [INFO] 
[timer.py:197:stop] 0/2614, RunningAvgSamplesPerSec=11.995652455830127, CurrSamplesPerSec=12.003021643275641, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:16:24,493] [INFO] [timer.py:197:stop] 0/2615, RunningAvgSamplesPerSec=11.995620316867823, CurrSamplesPerSec=11.912256958708136, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:16:30,955] [INFO] [timer.py:197:stop] 0/2616, RunningAvgSamplesPerSec=11.995623673157883, CurrSamplesPerSec=12.004400077955687, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:16:37,433] [INFO] [timer.py:197:stop] 0/2617, RunningAvgSamplesPerSec=11.995594765520057, CurrSamplesPerSec=11.92050340866268, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:16:43,854] [INFO] [timer.py:197:stop] 0/2618, RunningAvgSamplesPerSec=11.99559681961324, CurrSamplesPerSec=12.000970680547566, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:16:50,366] [INFO] [timer.py:197:stop] 0/2619, RunningAvgSamplesPerSec=11.995557015855818, CurrSamplesPerSec=11.892326811847305, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:16:56,802] [INFO] [logging.py:68:log_dist] [Rank 0] step=2620, skipped=4, lr=[5.300000000000001e-06], mom=[[0.9, 0.999]] [2022-12-19 23:16:56,803] [INFO] [timer.py:197:stop] 0/2620, RunningAvgSamplesPerSec=11.995563036578712, CurrSamplesPerSec=12.011339999432623, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:17:03,301] [INFO] [timer.py:197:stop] 0/2621, RunningAvgSamplesPerSec=11.995558414698648, CurrSamplesPerSec=11.983470530560577, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:17:07,921] [INFO] [timer.py:197:stop] 0/2622, RunningAvgSamplesPerSec=11.996838641794675, CurrSamplesPerSec=16.651013803628846, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:17:14,349] [INFO] [timer.py:197:stop] 0/2623, RunningAvgSamplesPerSec=11.996845593065839, CurrSamplesPerSec=12.015085624152503, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
23:17:20,795] [INFO] [timer.py:197:stop] 0/2624, RunningAvgSamplesPerSec=11.99684699279078, CurrSamplesPerSec=12.000516794525627, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:17:27,335] [INFO] [timer.py:197:stop] 0/2625, RunningAvgSamplesPerSec=11.99681506110908, CurrSamplesPerSec=11.913670671436181, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.288888888888889e-06, 'epoch': 69.08} [2022-12-19 23:17:33,785] [INFO] [timer.py:197:stop] 0/2626, RunningAvgSamplesPerSec=11.996805192489328, CurrSamplesPerSec=11.970975556480958, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:17:40,263] [INFO] [timer.py:197:stop] 0/2627, RunningAvgSamplesPerSec=11.996769892982234, CurrSamplesPerSec=11.904853931125237, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:17:46,646] [INFO] [timer.py:197:stop] 0/2628, RunningAvgSamplesPerSec=11.996771899842726, CurrSamplesPerSec=12.00204222381344, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:17:53,161] [INFO] [timer.py:197:stop] 0/2629, RunningAvgSamplesPerSec=11.9967744023476, CurrSamplesPerSec=12.003349583262887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:17:59,696] [INFO] [logging.py:68:log_dist] [Rank 0] step=2630, skipped=4, lr=[5.2777777777777785e-06], mom=[[0.9, 0.999]] [2022-12-19 23:17:59,697] [INFO] [timer.py:197:stop] 0/2630, RunningAvgSamplesPerSec=11.996733894928251, CurrSamplesPerSec=11.89125685899563, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:18:06,156] [INFO] [timer.py:197:stop] 0/2631, RunningAvgSamplesPerSec=11.99672881536494, CurrSamplesPerSec=11.983394566016234, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:18:12,645] [INFO] [timer.py:197:stop] 0/2632, RunningAvgSamplesPerSec=11.996724028332928, CurrSamplesPerSec=11.98415211469801, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:18:19,112] [INFO] [timer.py:197:stop] 0/2633, RunningAvgSamplesPerSec=11.996727890166495, 
CurrSamplesPerSec=12.00689312176857, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:18:25,627] [INFO] [timer.py:197:stop] 0/2634, RunningAvgSamplesPerSec=11.99668895400914, CurrSamplesPerSec=11.895115599579986, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:18:32,107] [INFO] [timer.py:197:stop] 0/2635, RunningAvgSamplesPerSec=11.996659169233235, CurrSamplesPerSec=11.918774776810016, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:18:38,515] [INFO] [timer.py:197:stop] 0/2636, RunningAvgSamplesPerSec=11.996619518347481, CurrSamplesPerSec=11.893119786398717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:18:44,965] [INFO] [timer.py:197:stop] 0/2637, RunningAvgSamplesPerSec=11.996580425206085, CurrSamplesPerSec=11.894485741994915, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:18:51,415] [INFO] [timer.py:197:stop] 0/2638, RunningAvgSamplesPerSec=11.996576211718697, CurrSamplesPerSec=11.985483941974703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:18:57,891] [INFO] [timer.py:197:stop] 0/2639, RunningAvgSamplesPerSec=11.996574852686694, CurrSamplesPerSec=11.992993514184398, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:19:04,329] [INFO] [logging.py:68:log_dist] [Rank 0] step=2640, skipped=4, lr=[5.255555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 23:19:04,330] [INFO] [timer.py:197:stop] 0/2640, RunningAvgSamplesPerSec=11.996579571839115, CurrSamplesPerSec=12.009036904031838, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:19:10,878] [INFO] [timer.py:197:stop] 0/2641, RunningAvgSamplesPerSec=11.996534072426016, CurrSamplesPerSec=11.877696066718702, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:19:17,435] [INFO] [timer.py:197:stop] 0/2642, RunningAvgSamplesPerSec=11.99650277278968, CurrSamplesPerSec=11.914468081498933, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:19:23,887] [INFO] [timer.py:197:stop] 0/2643, 
RunningAvgSamplesPerSec=11.99650732705503, CurrSamplesPerSec=12.008542654310418, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:19:30,397] [INFO] [timer.py:197:stop] 0/2644, RunningAvgSamplesPerSec=11.996463175879933, CurrSamplesPerSec=11.880982793970782, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:19:36,848] [INFO] [timer.py:197:stop] 0/2645, RunningAvgSamplesPerSec=11.996469581883861, CurrSamplesPerSec=12.013418164431052, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:19:43,365] [INFO] [timer.py:197:stop] 0/2646, RunningAvgSamplesPerSec=11.996438495859715, CurrSamplesPerSec=11.914837210625654, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:19:49,893] [INFO] [timer.py:197:stop] 0/2647, RunningAvgSamplesPerSec=11.996386435648844, CurrSamplesPerSec=11.860301277252887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:19:56,398] [INFO] [timer.py:197:stop] 0/2648, RunningAvgSamplesPerSec=11.996345428686967, CurrSamplesPerSec=11.888854249630926, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:20:02,902] [INFO] [timer.py:197:stop] 0/2649, RunningAvgSamplesPerSec=11.996271985827274, CurrSamplesPerSec=11.805041133419232, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:20:09,388] [INFO] [logging.py:68:log_dist] [Rank 0] step=2650, skipped=4, lr=[5.233333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 23:20:09,388] [INFO] [timer.py:197:stop] 0/2650, RunningAvgSamplesPerSec=11.996227443433515, CurrSamplesPerSec=11.879471682929745, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.233333333333334e-06, 'epoch': 69.74} [2022-12-19 23:20:15,918] [INFO] [timer.py:197:stop] 0/2651, RunningAvgSamplesPerSec=11.996178539543214, CurrSamplesPerSec=11.868064537650286, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:20:22,410] [INFO] [timer.py:197:stop] 0/2652, RunningAvgSamplesPerSec=11.996147885477548, CurrSamplesPerSec=11.915491439779187, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:20:28,914] [INFO] [timer.py:197:stop] 0/2653, RunningAvgSamplesPerSec=11.996098039643064, CurrSamplesPerSec=11.865445759819316, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:20:35,423] [INFO] [timer.py:197:stop] 0/2654, RunningAvgSamplesPerSec=11.99608214840425, CurrSamplesPerSec=11.954101955377544, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:20:41,902] [INFO] [timer.py:197:stop] 0/2655, RunningAvgSamplesPerSec=11.996037234509576, CurrSamplesPerSec=11.878097084599952, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:20:48,377] [INFO] [timer.py:197:stop] 0/2656, RunningAvgSamplesPerSec=11.996000453623141, CurrSamplesPerSec=11.899208399341427, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:20:54,879] [INFO] [timer.py:197:stop] 0/2657, RunningAvgSamplesPerSec=11.995963474727407, CurrSamplesPerSec=11.89861819172782, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:21:01,369] [INFO] [timer.py:197:stop] 0/2658, RunningAvgSamplesPerSec=11.995912946986405, CurrSamplesPerSec=11.863245977635698, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:21:07,868] [INFO] [timer.py:197:stop] 0/2659, RunningAvgSamplesPerSec=11.99589324207262, CurrSamplesPerSec=11.943784419495733, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:21:12,790] [INFO] [logging.py:68:log_dist] [Rank 0] step=2660, skipped=4, lr=[5.211111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 23:21:12,791] [INFO] [timer.py:197:stop] 0/2660, RunningAvgSamplesPerSec=11.997135368513332, CurrSamplesPerSec=16.55055001114117, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:21:19,751] [INFO] [timer.py:197:stop] 0/2661, RunningAvgSamplesPerSec=11.997104230837477, CurrSamplesPerSec=11.914907548712618, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:21:26,772] [INFO] [timer.py:197:stop] 0/2662, RunningAvgSamplesPerSec=11.997056875739124, 
CurrSamplesPerSec=11.872448015143775, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:21:33,759] [INFO] [timer.py:197:stop] 0/2663, RunningAvgSamplesPerSec=11.997037985405546, CurrSamplesPerSec=11.946999358222891, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:21:40,433] [INFO] [timer.py:197:stop] 0/2664, RunningAvgSamplesPerSec=11.997044948921303, CurrSamplesPerSec=12.015603539604983, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:21:47,099] [INFO] [timer.py:197:stop] 0/2665, RunningAvgSamplesPerSec=11.997037039411385, CurrSamplesPerSec=11.97601882538536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:21:53,579] [INFO] [timer.py:197:stop] 0/2666, RunningAvgSamplesPerSec=11.99698839122477, CurrSamplesPerSec=11.868822793053111, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:22:00,056] [INFO] [timer.py:197:stop] 0/2667, RunningAvgSamplesPerSec=11.996942371052711, CurrSamplesPerSec=11.875585256523664, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:22:06,557] [INFO] [timer.py:197:stop] 0/2668, RunningAvgSamplesPerSec=11.996896659068172, CurrSamplesPerSec=11.876299285330575, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:22:13,302] [INFO] [timer.py:197:stop] 0/2669, RunningAvgSamplesPerSec=11.996856993464396, CurrSamplesPerSec=11.892032832206176, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:22:19,995] [INFO] [logging.py:68:log_dist] [Rank 0] step=2670, skipped=4, lr=[5.188888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 23:22:19,995] [INFO] [timer.py:197:stop] 0/2670, RunningAvgSamplesPerSec=11.996821885252814, CurrSamplesPerSec=11.90391369294753, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:22:26,578] [INFO] [timer.py:197:stop] 0/2671, RunningAvgSamplesPerSec=11.996785016105184, CurrSamplesPerSec=11.899218421245225, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:22:33,011] [INFO] [timer.py:197:stop] 0/2672, 
RunningAvgSamplesPerSec=11.996788935677014, CurrSamplesPerSec=12.00725940668445, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:22:39,489] [INFO] [timer.py:197:stop] 0/2673, RunningAvgSamplesPerSec=11.996751153003556, CurrSamplesPerSec=11.896712943376556, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:22:46,005] [INFO] [timer.py:197:stop] 0/2674, RunningAvgSamplesPerSec=11.996752169175672, CurrSamplesPerSec=11.999466979338743, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:22:52,525] [INFO] [timer.py:197:stop] 0/2675, RunningAvgSamplesPerSec=11.996743640576053, CurrSamplesPerSec=11.973998444210299, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.177777777777779e-06, 'epoch': 70.39} [2022-12-19 23:22:59,195] [INFO] [timer.py:197:stop] 0/2676, RunningAvgSamplesPerSec=11.996704619052416, CurrSamplesPerSec=11.893299471739644, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:23:05,882] [INFO] [timer.py:197:stop] 0/2677, RunningAvgSamplesPerSec=11.996681214429486, CurrSamplesPerSec=11.934422165501191, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:23:12,462] [INFO] [timer.py:197:stop] 0/2678, RunningAvgSamplesPerSec=11.996635374338918, CurrSamplesPerSec=11.875254278917568, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:23:18,919] [INFO] [timer.py:197:stop] 0/2679, RunningAvgSamplesPerSec=11.996606713635927, CurrSamplesPerSec=11.920398067392066, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:23:25,381] [INFO] [logging.py:68:log_dist] [Rank 0] step=2680, skipped=4, lr=[5.1666666666666675e-06], mom=[[0.9, 0.999]] [2022-12-19 23:23:25,382] [INFO] [timer.py:197:stop] 0/2680, RunningAvgSamplesPerSec=11.996605145995055, CurrSamplesPerSec=11.992410039430785, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:23:31,847] [INFO] [timer.py:197:stop] 0/2681, RunningAvgSamplesPerSec=11.996534658847205, CurrSamplesPerSec=11.810695336759888, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:23:38,401] [INFO] [timer.py:197:stop] 0/2682, RunningAvgSamplesPerSec=11.996510165078321, CurrSamplesPerSec=11.931248461565609, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:23:44,960] [INFO] [timer.py:197:stop] 0/2683, RunningAvgSamplesPerSec=11.996507618162122, CurrSamplesPerSec=11.989685765669675, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:23:51,498] [INFO] [timer.py:197:stop] 0/2684, RunningAvgSamplesPerSec=11.996459474936309, CurrSamplesPerSec=11.868761919146383, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:23:57,951] [INFO] [timer.py:197:stop] 0/2685, RunningAvgSamplesPerSec=11.996459880237117, CurrSamplesPerSec=11.997546995545537, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:24:04,470] [INFO] [timer.py:197:stop] 0/2686, RunningAvgSamplesPerSec=11.996394734878898, CurrSamplesPerSec=11.82412067185435, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:24:10,986] [INFO] [timer.py:197:stop] 0/2687, RunningAvgSamplesPerSec=11.99635262519577, CurrSamplesPerSec=11.884385515201748, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:24:17,630] [INFO] [timer.py:197:stop] 0/2688, RunningAvgSamplesPerSec=11.996310277310535, CurrSamplesPerSec=11.883674196411809, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:24:24,255] [INFO] [timer.py:197:stop] 0/2689, RunningAvgSamplesPerSec=11.996277149511366, CurrSamplesPerSec=11.907951273811703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:24:30,697] [INFO] [logging.py:68:log_dist] [Rank 0] step=2690, skipped=4, lr=[5.144444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 23:24:30,698] [INFO] [timer.py:197:stop] 0/2690, RunningAvgSamplesPerSec=11.996282052155102, CurrSamplesPerSec=12.009469943202353, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:24:37,176] [INFO] [timer.py:197:stop] 0/2691, RunningAvgSamplesPerSec=11.996282965323301, 
CurrSamplesPerSec=11.998738063972361, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:24:43,718] [INFO] [timer.py:197:stop] 0/2692, RunningAvgSamplesPerSec=11.996251172425668, CurrSamplesPerSec=11.911365234003894, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:24:50,221] [INFO] [timer.py:197:stop] 0/2693, RunningAvgSamplesPerSec=11.996193035680573, CurrSamplesPerSec=11.841818437199361, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:24:56,794] [INFO] [timer.py:197:stop] 0/2694, RunningAvgSamplesPerSec=11.996163669515205, CurrSamplesPerSec=11.91765667306616, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:25:03,404] [INFO] [timer.py:197:stop] 0/2695, RunningAvgSamplesPerSec=11.996161493877645, CurrSamplesPerSec=11.990307536672809, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:25:10,007] [INFO] [timer.py:197:stop] 0/2696, RunningAvgSamplesPerSec=11.99616084795579, CurrSamplesPerSec=11.99442163268361, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:25:16,628] [INFO] [timer.py:197:stop] 0/2697, RunningAvgSamplesPerSec=11.996081093587382, CurrSamplesPerSec=11.785004754428154, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:25:21,336] [INFO] [timer.py:197:stop] 0/2698, RunningAvgSamplesPerSec=11.997253110046707, CurrSamplesPerSec=16.28515692989626, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:25:27,991] [INFO] [timer.py:197:stop] 0/2699, RunningAvgSamplesPerSec=11.997227329556214, CurrSamplesPerSec=11.928123618049009, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:25:34,604] [INFO] [logging.py:68:log_dist] [Rank 0] step=2700, skipped=4, lr=[5.122222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 23:25:34,605] [INFO] [timer.py:197:stop] 0/2700, RunningAvgSamplesPerSec=11.99719953123632, CurrSamplesPerSec=11.922693234711987, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.122222222222223e-06, 'epoch': 71.05} [2022-12-19 
23:25:41,270] [INFO] [timer.py:197:stop] 0/2701, RunningAvgSamplesPerSec=11.99716043031324, CurrSamplesPerSec=11.892586031760883, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:25:47,818] [INFO] [timer.py:197:stop] 0/2702, RunningAvgSamplesPerSec=11.997127937396781, CurrSamplesPerSec=11.910066208381947, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:25:54,318] [INFO] [timer.py:197:stop] 0/2703, RunningAvgSamplesPerSec=11.997095149706537, CurrSamplesPerSec=11.909217079944879, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:26:00,883] [INFO] [timer.py:197:stop] 0/2704, RunningAvgSamplesPerSec=11.9969840706824, CurrSamplesPerSec=11.704282316393593, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:26:07,552] [INFO] [timer.py:197:stop] 0/2705, RunningAvgSamplesPerSec=11.99689210701528, CurrSamplesPerSec=11.75345043431232, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:26:14,202] [INFO] [timer.py:197:stop] 0/2706, RunningAvgSamplesPerSec=11.996810318449063, CurrSamplesPerSec=11.779737472934098, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:26:21,026] [INFO] [timer.py:197:stop] 0/2707, RunningAvgSamplesPerSec=11.996708516128196, CurrSamplesPerSec=11.7276119466562, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:26:27,726] [INFO] [timer.py:197:stop] 0/2708, RunningAvgSamplesPerSec=11.996653231918927, CurrSamplesPerSec=11.848951298855791, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:26:34,465] [INFO] [timer.py:197:stop] 0/2709, RunningAvgSamplesPerSec=11.99656787016443, CurrSamplesPerSec=11.769944120757707, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:26:41,196] [INFO] [logging.py:68:log_dist] [Rank 0] step=2710, skipped=4, lr=[5.1e-06], mom=[[0.9, 0.999]] [2022-12-19 23:26:41,197] [INFO] [timer.py:197:stop] 0/2710, RunningAvgSamplesPerSec=11.99646913168438, CurrSamplesPerSec=11.735011573896983, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
23:26:47,949] [INFO] [timer.py:197:stop] 0/2711, RunningAvgSamplesPerSec=11.99626822256751, CurrSamplesPerSec=11.47581878279277, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:26:54,945] [INFO] [timer.py:197:stop] 0/2712, RunningAvgSamplesPerSec=11.996043564801278, CurrSamplesPerSec=11.416841316871833, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:27:02,171] [INFO] [timer.py:197:stop] 0/2713, RunningAvgSamplesPerSec=11.995736143657453, CurrSamplesPerSec=11.216746145034849, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:27:09,785] [INFO] [timer.py:197:stop] 0/2714, RunningAvgSamplesPerSec=11.995468156194852, CurrSamplesPerSec=11.31045759758005, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:27:16,515] [INFO] [timer.py:197:stop] 0/2715, RunningAvgSamplesPerSec=11.995290427688797, CurrSamplesPerSec=11.531917045260103, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:27:23,477] [INFO] [timer.py:197:stop] 0/2716, RunningAvgSamplesPerSec=11.994971357847655, CurrSamplesPerSec=11.187620157778788, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:27:30,527] [INFO] [timer.py:197:stop] 0/2717, RunningAvgSamplesPerSec=11.994675867799948, CurrSamplesPerSec=11.242991743565652, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:27:37,276] [INFO] [timer.py:197:stop] 0/2718, RunningAvgSamplesPerSec=11.994446823841649, CurrSamplesPerSec=11.403254256008326, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:27:44,485] [INFO] [timer.py:197:stop] 0/2719, RunningAvgSamplesPerSec=11.994218514252179, CurrSamplesPerSec=11.404622408584524, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:27:51,096] [INFO] [logging.py:68:log_dist] [Rank 0] step=2720, skipped=4, lr=[5.077777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 23:27:51,097] [INFO] [timer.py:197:stop] 0/2720, RunningAvgSamplesPerSec=11.994218844324998, CurrSamplesPerSec=11.995115719256633, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 23:27:57,672] [INFO] [timer.py:197:stop] 0/2721, RunningAvgSamplesPerSec=11.994211153220629, CurrSamplesPerSec=11.973343115327138, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:28:04,301] [INFO] [timer.py:197:stop] 0/2722, RunningAvgSamplesPerSec=11.994153652503169, CurrSamplesPerSec=11.839821667496242, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:28:10,802] [INFO] [timer.py:197:stop] 0/2723, RunningAvgSamplesPerSec=11.994142417714016, CurrSamplesPerSec=11.963661478982994, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:28:17,281] [INFO] [timer.py:197:stop] 0/2724, RunningAvgSamplesPerSec=11.994129531275108, CurrSamplesPerSec=11.959167776812597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:28:23,771] [INFO] [timer.py:197:stop] 0/2725, RunningAvgSamplesPerSec=11.994113285715333, CurrSamplesPerSec=11.950055366321873, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.0666666666666676e-06, 'epoch': 71.71} [2022-12-19 23:28:30,326] [INFO] [timer.py:197:stop] 0/2726, RunningAvgSamplesPerSec=11.994090574140353, CurrSamplesPerSec=11.932564312093849, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:28:36,883] [INFO] [timer.py:197:stop] 0/2727, RunningAvgSamplesPerSec=11.99409670710499, CurrSamplesPerSec=12.010826213324727, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:28:43,390] [INFO] [timer.py:197:stop] 0/2728, RunningAvgSamplesPerSec=11.994101173572574, CurrSamplesPerSec=12.006284665580578, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:28:50,024] [INFO] [timer.py:197:stop] 0/2729, RunningAvgSamplesPerSec=11.994071981129114, CurrSamplesPerSec=11.915018081670034, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:28:56,442] [INFO] [logging.py:68:log_dist] [Rank 0] step=2730, skipped=4, lr=[5.0555555555555555e-06], mom=[[0.9, 0.999]] [2022-12-19 23:28:56,442] [INFO] [timer.py:197:stop] 0/2730, 
RunningAvgSamplesPerSec=11.994066383214019, CurrSamplesPerSec=11.978820280427128, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:29:02,926] [INFO] [timer.py:197:stop] 0/2731, RunningAvgSamplesPerSec=11.99406423954292, CurrSamplesPerSec=11.988219155715687, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:29:09,429] [INFO] [timer.py:197:stop] 0/2732, RunningAvgSamplesPerSec=11.99402804088363, CurrSamplesPerSec=11.896049176486887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:29:16,034] [INFO] [timer.py:197:stop] 0/2733, RunningAvgSamplesPerSec=11.994033296044734, CurrSamplesPerSec=12.00839707329062, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:29:22,646] [INFO] [timer.py:197:stop] 0/2734, RunningAvgSamplesPerSec=11.994037676609434, CurrSamplesPerSec=12.006012947785482, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:29:29,099] [INFO] [timer.py:197:stop] 0/2735, RunningAvgSamplesPerSec=11.994048430765716, CurrSamplesPerSec=12.0235009583937, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:29:33,721] [INFO] [timer.py:197:stop] 0/2736, RunningAvgSamplesPerSec=11.995253390263796, CurrSamplesPerSec=16.535271901536714, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:29:40,158] [INFO] [timer.py:197:stop] 0/2737, RunningAvgSamplesPerSec=11.995255299183455, CurrSamplesPerSec=12.00047655805986, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:29:46,733] [INFO] [timer.py:197:stop] 0/2738, RunningAvgSamplesPerSec=11.99520438195101, CurrSamplesPerSec=11.857544501441094, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:29:53,127] [INFO] [timer.py:197:stop] 0/2739, RunningAvgSamplesPerSec=11.995210777017435, CurrSamplesPerSec=12.012733247381735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:29:59,634] [INFO] [logging.py:68:log_dist] [Rank 0] step=2740, skipped=4, lr=[5.033333333333333e-06], mom=[[0.9, 0.999]] [2022-12-19 23:29:59,635] [INFO] 
[timer.py:197:stop] 0/2740, RunningAvgSamplesPerSec=11.995163456798252, CurrSamplesPerSec=11.867031996834344, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:30:06,079] [INFO] [timer.py:197:stop] 0/2741, RunningAvgSamplesPerSec=11.995135362441145, CurrSamplesPerSec=11.918703334660469, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:30:12,579] [INFO] [timer.py:197:stop] 0/2742, RunningAvgSamplesPerSec=11.995120221115162, CurrSamplesPerSec=11.953791073371887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:30:19,126] [INFO] [timer.py:197:stop] 0/2743, RunningAvgSamplesPerSec=11.995076847761194, CurrSamplesPerSec=11.877400182287817, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:30:25,637] [INFO] [timer.py:197:stop] 0/2744, RunningAvgSamplesPerSec=11.99502004426388, CurrSamplesPerSec=11.841317484002989, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:30:32,185] [INFO] [timer.py:197:stop] 0/2745, RunningAvgSamplesPerSec=11.994983917162227, CurrSamplesPerSec=11.896735087790208, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:30:38,645] [INFO] [timer.py:197:stop] 0/2746, RunningAvgSamplesPerSec=11.994950533869618, CurrSamplesPerSec=11.90407417227612, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:30:45,170] [INFO] [timer.py:197:stop] 0/2747, RunningAvgSamplesPerSec=11.99489880060451, CurrSamplesPerSec=11.854603677515405, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:30:51,731] [INFO] [timer.py:197:stop] 0/2748, RunningAvgSamplesPerSec=11.994850241130646, CurrSamplesPerSec=11.863020015229859, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:30:58,227] [INFO] [timer.py:197:stop] 0/2749, RunningAvgSamplesPerSec=11.994830002118105, CurrSamplesPerSec=11.939510083950712, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:31:04,796] [INFO] [logging.py:68:log_dist] [Rank 0] step=2750, skipped=4, lr=[5.011111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 
23:31:04,797] [INFO] [timer.py:197:stop] 0/2750, RunningAvgSamplesPerSec=11.994786862553477, CurrSamplesPerSec=11.877442225358148, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 5.011111111111111e-06, 'epoch': 72.37} [2022-12-19 23:31:11,298] [INFO] [timer.py:197:stop] 0/2751, RunningAvgSamplesPerSec=11.994738531835914, CurrSamplesPerSec=11.863380720336483, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:31:17,832] [INFO] [timer.py:197:stop] 0/2752, RunningAvgSamplesPerSec=11.994709991855048, CurrSamplesPerSec=11.916763610762365, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:31:24,291] [INFO] [timer.py:197:stop] 0/2753, RunningAvgSamplesPerSec=11.994718217644026, CurrSamplesPerSec=12.01738189445387, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:31:30,725] [INFO] [timer.py:197:stop] 0/2754, RunningAvgSamplesPerSec=11.994730738864046, CurrSamplesPerSec=12.029275856166072, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:31:37,194] [INFO] [timer.py:197:stop] 0/2755, RunningAvgSamplesPerSec=11.994736772240138, CurrSamplesPerSec=12.011363647550091, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:31:43,721] [INFO] [timer.py:197:stop] 0/2756, RunningAvgSamplesPerSec=11.994691202979272, CurrSamplesPerSec=11.870538014577784, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:31:50,204] [INFO] [timer.py:197:stop] 0/2757, RunningAvgSamplesPerSec=11.994639010700212, CurrSamplesPerSec=11.852604164182981, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:31:56,716] [INFO] [timer.py:197:stop] 0/2758, RunningAvgSamplesPerSec=11.99460219963547, CurrSamplesPerSec=11.89403829374465, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:32:03,145] [INFO] [timer.py:197:stop] 0/2759, RunningAvgSamplesPerSec=11.994598339464275, CurrSamplesPerSec=11.983969138658843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:32:09,586] [INFO] [logging.py:68:log_dist] 
[Rank 0] step=2760, skipped=4, lr=[4.988888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 23:32:09,587] [INFO] [timer.py:197:stop] 0/2760, RunningAvgSamplesPerSec=11.99457071262971, CurrSamplesPerSec=11.918884323102784, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:32:16,118] [INFO] [timer.py:197:stop] 0/2761, RunningAvgSamplesPerSec=11.994571636487413, CurrSamplesPerSec=11.997120177612585, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:32:22,572] [INFO] [timer.py:197:stop] 0/2762, RunningAvgSamplesPerSec=11.994543413801953, CurrSamplesPerSec=11.917179439471989, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:32:29,101] [INFO] [timer.py:197:stop] 0/2763, RunningAvgSamplesPerSec=11.994515247216023, CurrSamplesPerSec=11.917276258718129, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:32:35,565] [INFO] [timer.py:197:stop] 0/2764, RunningAvgSamplesPerSec=11.994520406543298, CurrSamplesPerSec=12.008782252898929, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:32:42,040] [INFO] [timer.py:197:stop] 0/2765, RunningAvgSamplesPerSec=11.994499883613116, CurrSamplesPerSec=11.93808226876145, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:32:48,496] [INFO] [timer.py:197:stop] 0/2766, RunningAvgSamplesPerSec=11.994478455338928, CurrSamplesPerSec=11.93556305317796, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:32:54,996] [INFO] [timer.py:197:stop] 0/2767, RunningAvgSamplesPerSec=11.994438028884355, CurrSamplesPerSec=11.883731014605983, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:33:01,612] [INFO] [timer.py:197:stop] 0/2768, RunningAvgSamplesPerSec=11.994286853790381, CurrSamplesPerSec=11.590369278075817, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:33:08,048] [INFO] [timer.py:197:stop] 0/2769, RunningAvgSamplesPerSec=11.994291348773052, CurrSamplesPerSec=12.006737376889609, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:33:14,531] [INFO] 
[logging.py:68:log_dist] [Rank 0] step=2770, skipped=4, lr=[4.966666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 23:33:14,532] [INFO] [timer.py:197:stop] 0/2770, RunningAvgSamplesPerSec=11.994263745089707, CurrSamplesPerSec=11.918367832613626, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:33:21,014] [INFO] [timer.py:197:stop] 0/2771, RunningAvgSamplesPerSec=11.994237401708569, CurrSamplesPerSec=11.92175970710098, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:33:27,522] [INFO] [timer.py:197:stop] 0/2772, RunningAvgSamplesPerSec=11.994216837782492, CurrSamplesPerSec=11.937544470818604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:33:34,012] [INFO] [timer.py:197:stop] 0/2773, RunningAvgSamplesPerSec=11.994192719495656, CurrSamplesPerSec=11.927755255239576, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:33:38,607] [INFO] [timer.py:197:stop] 0/2774, RunningAvgSamplesPerSec=11.995409543932212, CurrSamplesPerSec=16.68627268518959, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:33:45,041] [INFO] [timer.py:197:stop] 0/2775, RunningAvgSamplesPerSec=11.995348139327294, CurrSamplesPerSec=11.827516952224112, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.9555555555555565e-06, 'epoch': 73.03} [2022-12-19 23:33:51,476] [INFO] [timer.py:197:stop] 0/2776, RunningAvgSamplesPerSec=11.995344971363043, CurrSamplesPerSec=11.986566637601742, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:33:57,926] [INFO] [timer.py:197:stop] 0/2777, RunningAvgSamplesPerSec=11.995319549949997, CurrSamplesPerSec=11.925212847373128, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:34:04,412] [INFO] [timer.py:197:stop] 0/2778, RunningAvgSamplesPerSec=11.995285411713645, CurrSamplesPerSec=11.901294374126918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:34:10,933] [INFO] [timer.py:197:stop] 0/2779, RunningAvgSamplesPerSec=11.995255569798147, 
CurrSamplesPerSec=11.912982805942818, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:34:17,403] [INFO] [logging.py:68:log_dist] [Rank 0] step=2780, skipped=4, lr=[4.944444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 23:34:17,404] [INFO] [timer.py:197:stop] 0/2780, RunningAvgSamplesPerSec=11.995253791387068, CurrSamplesPerSec=11.990317177040986, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:34:23,872] [INFO] [timer.py:197:stop] 0/2781, RunningAvgSamplesPerSec=11.995230503197918, CurrSamplesPerSec=11.930883087411376, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:34:30,376] [INFO] [timer.py:197:stop] 0/2782, RunningAvgSamplesPerSec=11.995185872143587, CurrSamplesPerSec=11.872425961052972, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:34:36,908] [INFO] [timer.py:197:stop] 0/2783, RunningAvgSamplesPerSec=11.995158359094326, CurrSamplesPerSec=11.91915687454915, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:34:43,271] [INFO] [timer.py:197:stop] 0/2784, RunningAvgSamplesPerSec=11.995160743670029, CurrSamplesPerSec=12.001795918265152, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:34:49,818] [INFO] [timer.py:197:stop] 0/2785, RunningAvgSamplesPerSec=11.995132672876716, CurrSamplesPerSec=11.917545032958047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:34:56,331] [INFO] [timer.py:197:stop] 0/2786, RunningAvgSamplesPerSec=11.995099491451521, CurrSamplesPerSec=11.903461312342019, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:35:02,830] [INFO] [timer.py:197:stop] 0/2787, RunningAvgSamplesPerSec=11.99506810062795, CurrSamplesPerSec=11.908308377173617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:35:09,352] [INFO] [timer.py:197:stop] 0/2788, RunningAvgSamplesPerSec=11.995034416514528, CurrSamplesPerSec=11.90195239395615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:35:15,816] [INFO] [timer.py:197:stop] 0/2789, 
RunningAvgSamplesPerSec=11.995010076388672, CurrSamplesPerSec=11.92757982713692, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:35:22,260] [INFO] [logging.py:68:log_dist] [Rank 0] step=2790, skipped=4, lr=[4.922222222222223e-06], mom=[[0.9, 0.999]] [2022-12-19 23:35:22,261] [INFO] [timer.py:197:stop] 0/2790, RunningAvgSamplesPerSec=11.99498463165775, CurrSamplesPerSec=11.924487098223073, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:35:28,795] [INFO] [timer.py:197:stop] 0/2791, RunningAvgSamplesPerSec=11.994983034740743, CurrSamplesPerSec=11.990532482636125, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:35:35,194] [INFO] [timer.py:197:stop] 0/2792, RunningAvgSamplesPerSec=11.99497267346927, CurrSamplesPerSec=11.96614456322677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:35:41,724] [INFO] [timer.py:197:stop] 0/2793, RunningAvgSamplesPerSec=11.994941989830576, CurrSamplesPerSec=11.909941499918451, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:35:48,187] [INFO] [timer.py:197:stop] 0/2794, RunningAvgSamplesPerSec=11.9949135866921, CurrSamplesPerSec=11.91616108095669, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:35:55,048] [INFO] [timer.py:197:stop] 0/2795, RunningAvgSamplesPerSec=11.994891863634376, CurrSamplesPerSec=11.934546325954587, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:36:02,088] [INFO] [timer.py:197:stop] 0/2796, RunningAvgSamplesPerSec=11.9948446143744, CurrSamplesPerSec=11.864314041739986, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:36:08,927] [INFO] [timer.py:197:stop] 0/2797, RunningAvgSamplesPerSec=11.994831484446387, CurrSamplesPerSec=11.958258360932666, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:36:16,004] [INFO] [timer.py:197:stop] 0/2798, RunningAvgSamplesPerSec=11.994790821678444, CurrSamplesPerSec=11.882205529426862, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:36:22,505] [INFO] [timer.py:197:stop] 
0/2799, RunningAvgSamplesPerSec=11.994794374061838, CurrSamplesPerSec=12.00473507251396, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:36:29,225] [INFO] [logging.py:68:log_dist] [Rank 0] step=2800, skipped=4, lr=[4.9000000000000005e-06], mom=[[0.9, 0.999]] [2022-12-19 23:36:29,225] [INFO] [timer.py:197:stop] 0/2800, RunningAvgSamplesPerSec=11.994747302472852, CurrSamplesPerSec=11.864518028724076, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.9000000000000005e-06, 'epoch': 73.68} [2022-12-19 23:36:35,675] [INFO] [timer.py:197:stop] 0/2801, RunningAvgSamplesPerSec=11.994746590018329, CurrSamplesPerSec=11.992753473619294, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:36:42,143] [INFO] [timer.py:197:stop] 0/2802, RunningAvgSamplesPerSec=11.994710971456705, CurrSamplesPerSec=11.895836723464887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:36:48,610] [INFO] [timer.py:197:stop] 0/2803, RunningAvgSamplesPerSec=11.994711643311364, CurrSamplesPerSec=11.996593131543682, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:36:55,193] [INFO] [timer.py:197:stop] 0/2804, RunningAvgSamplesPerSec=11.994640827310024, CurrSamplesPerSec=11.799513189313304, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:37:01,807] [INFO] [timer.py:197:stop] 0/2805, RunningAvgSamplesPerSec=11.99459466681401, CurrSamplesPerSec=11.866633299484299, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:37:08,468] [INFO] [timer.py:197:stop] 0/2806, RunningAvgSamplesPerSec=11.994551270978453, CurrSamplesPerSec=11.87413434551529, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:37:15,201] [INFO] [timer.py:197:stop] 0/2807, RunningAvgSamplesPerSec=11.994503313081292, CurrSamplesPerSec=11.861520807569196, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:37:21,830] [INFO] [timer.py:197:stop] 0/2808, RunningAvgSamplesPerSec=11.994500890722415, CurrSamplesPerSec=11.987710022371274, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:37:28,319] [INFO] [timer.py:197:stop] 0/2809, RunningAvgSamplesPerSec=11.99449129481775, CurrSamplesPerSec=11.96762551808092, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:37:34,820] [INFO] [logging.py:68:log_dist] [Rank 0] step=2810, skipped=4, lr=[4.877777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 23:37:34,820] [INFO] [timer.py:197:stop] 0/2810, RunningAvgSamplesPerSec=11.994478297299317, CurrSamplesPerSec=11.958104940606495, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:37:41,307] [INFO] [timer.py:197:stop] 0/2811, RunningAvgSamplesPerSec=11.994466766938656, CurrSamplesPerSec=11.962176707388547, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:37:46,021] [INFO] [timer.py:197:stop] 0/2812, RunningAvgSamplesPerSec=11.995617584566874, CurrSamplesPerSec=16.42136401544995, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:37:52,568] [INFO] [timer.py:197:stop] 0/2813, RunningAvgSamplesPerSec=11.995551835504989, CurrSamplesPerSec=11.813600375802428, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:37:59,100] [INFO] [timer.py:197:stop] 0/2814, RunningAvgSamplesPerSec=11.995545472821746, CurrSamplesPerSec=11.977686607475677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:38:05,625] [INFO] [timer.py:197:stop] 0/2815, RunningAvgSamplesPerSec=11.99552782747682, CurrSamplesPerSec=11.946113588974285, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:38:12,135] [INFO] [timer.py:197:stop] 0/2816, RunningAvgSamplesPerSec=11.995536605919279, CurrSamplesPerSec=12.020281321615673, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:38:18,814] [INFO] [timer.py:197:stop] 0/2817, RunningAvgSamplesPerSec=11.995504905173195, CurrSamplesPerSec=11.906957730560379, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:38:25,395] [INFO] [timer.py:197:stop] 0/2818, RunningAvgSamplesPerSec=11.995489916618782, 
CurrSamplesPerSec=11.953445076454539, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:38:31,851] [INFO] [timer.py:197:stop] 0/2819, RunningAvgSamplesPerSec=11.9954492655226, CurrSamplesPerSec=11.882058261934946, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:38:38,280] [INFO] [logging.py:68:log_dist] [Rank 0] step=2820, skipped=4, lr=[4.855555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 23:38:38,281] [INFO] [timer.py:197:stop] 0/2820, RunningAvgSamplesPerSec=11.995404347123824, CurrSamplesPerSec=11.870190521463236, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:38:44,752] [INFO] [timer.py:197:stop] 0/2821, RunningAvgSamplesPerSec=11.995391802705896, CurrSamplesPerSec=11.960145539844815, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:38:51,313] [INFO] [timer.py:197:stop] 0/2822, RunningAvgSamplesPerSec=11.995396030298194, CurrSamplesPerSec=12.007325469245938, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:38:57,907] [INFO] [timer.py:197:stop] 0/2823, RunningAvgSamplesPerSec=11.99539424227348, CurrSamplesPerSec=11.990354131929271, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:39:04,587] [INFO] [timer.py:197:stop] 0/2824, RunningAvgSamplesPerSec=11.995375320058155, CurrSamplesPerSec=11.942232321457736, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:39:11,161] [INFO] [timer.py:197:stop] 0/2825, RunningAvgSamplesPerSec=11.9953463046593, CurrSamplesPerSec=11.91402018618213, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.8444444444444446e-06, 'epoch': 74.34} [2022-12-19 23:39:17,625] [INFO] [timer.py:197:stop] 0/2826, RunningAvgSamplesPerSec=11.995353926673253, CurrSamplesPerSec=12.016909551661946, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:39:24,149] [INFO] [timer.py:197:stop] 0/2827, RunningAvgSamplesPerSec=11.99529384506374, CurrSamplesPerSec=11.827990679681673, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
23:39:30,626] [INFO] [timer.py:197:stop] 0/2828, RunningAvgSamplesPerSec=11.995264810975941, CurrSamplesPerSec=11.913800745632631, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:39:37,083] [INFO] [timer.py:197:stop] 0/2829, RunningAvgSamplesPerSec=11.995238255241542, CurrSamplesPerSec=11.920658512515194, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:39:43,580] [INFO] [logging.py:68:log_dist] [Rank 0] step=2830, skipped=4, lr=[4.833333333333333e-06], mom=[[0.9, 0.999]] [2022-12-19 23:39:43,581] [INFO] [timer.py:197:stop] 0/2830, RunningAvgSamplesPerSec=11.99519900060084, CurrSamplesPerSec=11.885243735745346, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:39:50,174] [INFO] [timer.py:197:stop] 0/2831, RunningAvgSamplesPerSec=11.99518003662598, CurrSamplesPerSec=11.941788711286401, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:39:56,800] [INFO] [timer.py:197:stop] 0/2832, RunningAvgSamplesPerSec=11.995157452615425, CurrSamplesPerSec=11.931605902929027, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:40:03,400] [INFO] [timer.py:197:stop] 0/2833, RunningAvgSamplesPerSec=11.99517084625352, CurrSamplesPerSec=12.033195038603344, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:40:09,884] [INFO] [timer.py:197:stop] 0/2834, RunningAvgSamplesPerSec=11.995184291786037, CurrSamplesPerSec=12.033369811205048, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:40:16,326] [INFO] [timer.py:197:stop] 0/2835, RunningAvgSamplesPerSec=11.995185990461273, CurrSamplesPerSec=11.999998569488696, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:40:22,755] [INFO] [timer.py:197:stop] 0/2836, RunningAvgSamplesPerSec=11.995187703258228, CurrSamplesPerSec=12.000042021421716, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:40:29,268] [INFO] [timer.py:197:stop] 0/2837, RunningAvgSamplesPerSec=11.995186467634046, CurrSamplesPerSec=11.9916857310328, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-19 23:40:35,933] [INFO] [timer.py:197:stop] 0/2838, RunningAvgSamplesPerSec=11.99518889310726, CurrSamplesPerSec=12.0020690550938, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:40:42,650] [INFO] [timer.py:197:stop] 0/2839, RunningAvgSamplesPerSec=11.99514700035247, CurrSamplesPerSec=11.877504764975935, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:40:49,228] [INFO] [logging.py:68:log_dist] [Rank 0] step=2840, skipped=4, lr=[4.811111111111111e-06], mom=[[0.9, 0.999]] [2022-12-19 23:40:49,228] [INFO] [timer.py:197:stop] 0/2840, RunningAvgSamplesPerSec=11.995156139246793, CurrSamplesPerSec=12.021139363923899, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:40:55,647] [INFO] [timer.py:197:stop] 0/2841, RunningAvgSamplesPerSec=11.995161589507466, CurrSamplesPerSec=12.010649407978699, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:41:02,095] [INFO] [timer.py:197:stop] 0/2842, RunningAvgSamplesPerSec=11.995129901140645, CurrSamplesPerSec=11.905836562172746, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:41:08,686] [INFO] [timer.py:197:stop] 0/2843, RunningAvgSamplesPerSec=11.995058210004018, CurrSamplesPerSec=11.794854809531884, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:41:15,227] [INFO] [timer.py:197:stop] 0/2844, RunningAvgSamplesPerSec=11.994993606557177, CurrSamplesPerSec=11.81422221588458, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:41:21,782] [INFO] [timer.py:197:stop] 0/2845, RunningAvgSamplesPerSec=11.994973929928433, CurrSamplesPerSec=11.939312537655683, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:41:28,410] [INFO] [timer.py:197:stop] 0/2846, RunningAvgSamplesPerSec=11.994974892516233, CurrSamplesPerSec=11.99771215435208, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:41:34,946] [INFO] [timer.py:197:stop] 0/2847, RunningAvgSamplesPerSec=11.994950189192433, CurrSamplesPerSec=11.925103184719262, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 23:41:41,611] [INFO] [timer.py:197:stop] 0/2848, RunningAvgSamplesPerSec=11.994883673984992, CurrSamplesPerSec=11.808588011827968, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:41:48,257] [INFO] [timer.py:197:stop] 0/2849, RunningAvgSamplesPerSec=11.994885506522081, CurrSamplesPerSec=12.000103176527215, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:41:52,909] [INFO] [logging.py:68:log_dist] [Rank 0] step=2850, skipped=4, lr=[4.7888888888888894e-06], mom=[[0.9, 0.999]] [2022-12-19 23:41:52,910] [INFO] [timer.py:197:stop] 0/2850, RunningAvgSamplesPerSec=11.996044206712059, CurrSamplesPerSec=16.546697748651876, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.7888888888888894e-06, 'epoch': 75.0} [2022-12-19 23:41:59,496] [INFO] [timer.py:197:stop] 0/2851, RunningAvgSamplesPerSec=11.996050632792386, CurrSamplesPerSec=12.014380083254386, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:42:06,168] [INFO] [timer.py:197:stop] 0/2852, RunningAvgSamplesPerSec=11.996021913158797, CurrSamplesPerSec=11.914754181020333, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:42:12,808] [INFO] [timer.py:197:stop] 0/2853, RunningAvgSamplesPerSec=11.995985228406497, CurrSamplesPerSec=11.89233734902182, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:42:19,393] [INFO] [timer.py:197:stop] 0/2854, RunningAvgSamplesPerSec=11.995953857659744, CurrSamplesPerSec=11.907177975446508, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:42:25,859] [INFO] [timer.py:197:stop] 0/2855, RunningAvgSamplesPerSec=11.99596124843588, CurrSamplesPerSec=12.017076857902913, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:42:32,336] [INFO] [timer.py:197:stop] 0/2856, RunningAvgSamplesPerSec=11.995947817626535, CurrSamplesPerSec=11.957751769192775, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:42:38,834] [INFO] [timer.py:197:stop] 0/2857, 
RunningAvgSamplesPerSec=11.9959055307113, CurrSamplesPerSec=11.876421188453289, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:42:45,342] [INFO] [timer.py:197:stop] 0/2858, RunningAvgSamplesPerSec=11.995860734122244, CurrSamplesPerSec=11.869316104640644, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:42:51,890] [INFO] [timer.py:197:stop] 0/2859, RunningAvgSamplesPerSec=11.995823711724112, CurrSamplesPerSec=11.89101191903202, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:42:58,465] [INFO] [logging.py:68:log_dist] [Rank 0] step=2860, skipped=4, lr=[4.766666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 23:42:58,466] [INFO] [timer.py:197:stop] 0/2860, RunningAvgSamplesPerSec=11.995801869732366, CurrSamplesPerSec=11.93372235211941, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:43:05,129] [INFO] [timer.py:197:stop] 0/2861, RunningAvgSamplesPerSec=11.995756367804592, CurrSamplesPerSec=11.867107017883102, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:43:11,741] [INFO] [timer.py:197:stop] 0/2862, RunningAvgSamplesPerSec=11.995730307322603, CurrSamplesPerSec=11.92168346399783, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:43:18,319] [INFO] [timer.py:197:stop] 0/2863, RunningAvgSamplesPerSec=11.99573642273011, CurrSamplesPerSec=12.013252035303735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:43:24,849] [INFO] [timer.py:197:stop] 0/2864, RunningAvgSamplesPerSec=11.995742772868976, CurrSamplesPerSec=12.013938086891638, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:43:31,329] [INFO] [timer.py:197:stop] 0/2865, RunningAvgSamplesPerSec=11.995727042534, CurrSamplesPerSec=11.950875212524075, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:43:37,846] [INFO] [timer.py:197:stop] 0/2866, RunningAvgSamplesPerSec=11.995688944324613, CurrSamplesPerSec=11.88759697972693, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:43:44,268] [INFO] [timer.py:197:stop] 
0/2867, RunningAvgSamplesPerSec=11.995695188517471, CurrSamplesPerSec=12.013605266813604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:43:50,939] [INFO] [timer.py:197:stop] 0/2868, RunningAvgSamplesPerSec=11.995667819365977, CurrSamplesPerSec=11.91776461146271, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:43:57,465] [INFO] [timer.py:197:stop] 0/2869, RunningAvgSamplesPerSec=11.99567232245394, CurrSamplesPerSec=12.008592077451809, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:44:03,902] [INFO] [logging.py:68:log_dist] [Rank 0] step=2870, skipped=4, lr=[4.744444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 23:44:03,903] [INFO] [timer.py:197:stop] 0/2870, RunningAvgSamplesPerSec=11.995673790366103, CurrSamplesPerSec=11.999883772068443, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:44:10,445] [INFO] [timer.py:197:stop] 0/2871, RunningAvgSamplesPerSec=11.995625233927706, CurrSamplesPerSec=11.857964064777809, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:44:16,924] [INFO] [timer.py:197:stop] 0/2872, RunningAvgSamplesPerSec=11.995604547347426, CurrSamplesPerSec=11.936547044972027, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:44:23,326] [INFO] [timer.py:197:stop] 0/2873, RunningAvgSamplesPerSec=11.995609253984249, CurrSamplesPerSec=12.00913253530572, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:44:29,778] [INFO] [timer.py:197:stop] 0/2874, RunningAvgSamplesPerSec=11.995592505317317, CurrSamplesPerSec=11.947699134360214, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:44:36,228] [INFO] [timer.py:197:stop] 0/2875, RunningAvgSamplesPerSec=11.995540783086717, CurrSamplesPerSec=11.848812176639454, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.7333333333333335e-06, 'epoch': 75.66} [2022-12-19 23:44:42,719] [INFO] [timer.py:197:stop] 0/2876, RunningAvgSamplesPerSec=11.995528758125452, CurrSamplesPerSec=11.961080292227095, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:44:49,315] [INFO] [timer.py:197:stop] 0/2877, RunningAvgSamplesPerSec=11.995467651581732, CurrSamplesPerSec=11.822382389548984, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:44:55,871] [INFO] [timer.py:197:stop] 0/2878, RunningAvgSamplesPerSec=11.995427321190762, CurrSamplesPerSec=11.880587891040012, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:45:02,360] [INFO] [timer.py:197:stop] 0/2879, RunningAvgSamplesPerSec=11.995396320352492, CurrSamplesPerSec=11.906895936696326, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:45:08,836] [INFO] [logging.py:68:log_dist] [Rank 0] step=2880, skipped=4, lr=[4.722222222222222e-06], mom=[[0.9, 0.999]] [2022-12-19 23:45:08,837] [INFO] [timer.py:197:stop] 0/2880, RunningAvgSamplesPerSec=11.995391532309146, CurrSamplesPerSec=11.981632138027312, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:45:15,314] [INFO] [timer.py:197:stop] 0/2881, RunningAvgSamplesPerSec=11.995399920454267, CurrSamplesPerSec=12.01958970164683, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:45:21,841] [INFO] [timer.py:197:stop] 0/2882, RunningAvgSamplesPerSec=11.995386200467326, CurrSamplesPerSec=11.956016045847313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:45:28,311] [INFO] [timer.py:197:stop] 0/2883, RunningAvgSamplesPerSec=11.995354620659237, CurrSamplesPerSec=11.905089409870286, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:45:34,840] [INFO] [timer.py:197:stop] 0/2884, RunningAvgSamplesPerSec=11.995332188920155, CurrSamplesPerSec=11.931052779445885, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:45:41,404] [INFO] [timer.py:197:stop] 0/2885, RunningAvgSamplesPerSec=11.995303047355343, CurrSamplesPerSec=11.91190120427437, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:45:47,852] [INFO] [timer.py:197:stop] 0/2886, RunningAvgSamplesPerSec=11.995280365996916, 
CurrSamplesPerSec=11.93024466437619, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:45:54,334] [INFO] [timer.py:197:stop] 0/2887, RunningAvgSamplesPerSec=11.995256573231098, CurrSamplesPerSec=11.927028666674783, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:45:58,988] [INFO] [timer.py:197:stop] 0/2888, RunningAvgSamplesPerSec=11.996395072970081, CurrSamplesPerSec=16.519922339542216, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:46:05,506] [INFO] [timer.py:197:stop] 0/2889, RunningAvgSamplesPerSec=11.996369370006356, CurrSamplesPerSec=11.922646634291452, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:46:11,987] [INFO] [logging.py:68:log_dist] [Rank 0] step=2890, skipped=4, lr=[4.7e-06], mom=[[0.9, 0.999]] [2022-12-19 23:46:11,988] [INFO] [timer.py:197:stop] 0/2890, RunningAvgSamplesPerSec=11.996371579145174, CurrSamplesPerSec=12.002752756590795, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:46:18,505] [INFO] [timer.py:197:stop] 0/2891, RunningAvgSamplesPerSec=11.996339979520815, CurrSamplesPerSec=11.905769499442712, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:46:25,006] [INFO] [timer.py:197:stop] 0/2892, RunningAvgSamplesPerSec=11.996318473977574, CurrSamplesPerSec=11.934509183667798, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:46:31,444] [INFO] [timer.py:197:stop] 0/2893, RunningAvgSamplesPerSec=11.996321592797116, CurrSamplesPerSec=12.005341760888399, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:46:37,941] [INFO] [timer.py:197:stop] 0/2894, RunningAvgSamplesPerSec=11.996292617273836, CurrSamplesPerSec=11.913105463301342, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:46:44,418] [INFO] [timer.py:197:stop] 0/2895, RunningAvgSamplesPerSec=11.99625509532189, CurrSamplesPerSec=11.888714715354709, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:46:50,925] [INFO] [timer.py:197:stop] 0/2896, RunningAvgSamplesPerSec=11.996242206633672, 
CurrSamplesPerSec=11.959070808485007, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:46:57,323] [INFO] [timer.py:197:stop] 0/2897, RunningAvgSamplesPerSec=11.996246537067483, CurrSamplesPerSec=12.008791922997844, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:47:03,742] [INFO] [timer.py:197:stop] 0/2898, RunningAvgSamplesPerSec=11.996252100558165, CurrSamplesPerSec=12.01238006714857, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:47:10,229] [INFO] [timer.py:197:stop] 0/2899, RunningAvgSamplesPerSec=11.996226388234524, CurrSamplesPerSec=11.92222301081288, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:47:16,739] [INFO] [logging.py:68:log_dist] [Rank 0] step=2900, skipped=4, lr=[4.677777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 23:47:16,740] [INFO] [timer.py:197:stop] 0/2900, RunningAvgSamplesPerSec=11.996185769314122, CurrSamplesPerSec=11.87965621346321, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.677777777777778e-06, 'epoch': 76.32} [2022-12-19 23:47:23,160] [INFO] [timer.py:197:stop] 0/2901, RunningAvgSamplesPerSec=11.996196464310772, CurrSamplesPerSec=12.027270878045066, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:47:29,630] [INFO] [timer.py:197:stop] 0/2902, RunningAvgSamplesPerSec=11.996194266087848, CurrSamplesPerSec=11.989825002495689, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:47:36,063] [INFO] [timer.py:197:stop] 0/2903, RunningAvgSamplesPerSec=11.99619863177601, CurrSamplesPerSec=12.008872507761119, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:47:42,604] [INFO] [timer.py:197:stop] 0/2904, RunningAvgSamplesPerSec=11.996153542635701, CurrSamplesPerSec=11.866761299397877, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:47:49,150] [INFO] [timer.py:197:stop] 0/2905, RunningAvgSamplesPerSec=11.996114919505782, CurrSamplesPerSec=11.885068503455313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 
23:47:55,567] [INFO] [timer.py:197:stop] 0/2906, RunningAvgSamplesPerSec=11.996099455671791, CurrSamplesPerSec=11.951375368628302, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:48:02,075] [INFO] [timer.py:197:stop] 0/2907, RunningAvgSamplesPerSec=11.996069140529142, CurrSamplesPerSec=11.908675539794567, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:48:08,527] [INFO] [timer.py:197:stop] 0/2908, RunningAvgSamplesPerSec=11.996057759113098, CurrSamplesPerSec=11.963085653069902, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:48:15,045] [INFO] [timer.py:197:stop] 0/2909, RunningAvgSamplesPerSec=11.996026714254247, CurrSamplesPerSec=11.906483993993076, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:48:21,591] [INFO] [logging.py:68:log_dist] [Rank 0] step=2910, skipped=4, lr=[4.655555555555556e-06], mom=[[0.9, 0.999]] [2022-12-19 23:48:21,592] [INFO] [timer.py:197:stop] 0/2910, RunningAvgSamplesPerSec=11.995994184047277, CurrSamplesPerSec=11.902168759594689, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:48:28,064] [INFO] [timer.py:197:stop] 0/2911, RunningAvgSamplesPerSec=11.995972741144111, CurrSamplesPerSec=11.933939343850737, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:48:34,581] [INFO] [timer.py:197:stop] 0/2912, RunningAvgSamplesPerSec=11.99593355078353, CurrSamplesPerSec=11.88300241506502, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:48:41,040] [INFO] [timer.py:197:stop] 0/2913, RunningAvgSamplesPerSec=11.995886612223023, CurrSamplesPerSec=11.860833710010874, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:48:47,584] [INFO] [timer.py:197:stop] 0/2914, RunningAvgSamplesPerSec=11.995849938547407, CurrSamplesPerSec=11.890034893351748, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:48:54,078] [INFO] [timer.py:197:stop] 0/2915, RunningAvgSamplesPerSec=11.995845443004521, CurrSamplesPerSec=11.982768697624998, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-19 23:49:00,606] [INFO] [timer.py:197:stop] 0/2916, RunningAvgSamplesPerSec=11.99580578508577, CurrSamplesPerSec=11.881384559710323, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:49:07,103] [INFO] [timer.py:197:stop] 0/2917, RunningAvgSamplesPerSec=11.99578397808125, CurrSamplesPerSec=11.932573329402565, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:49:13,602] [INFO] [timer.py:197:stop] 0/2918, RunningAvgSamplesPerSec=11.99579192462954, CurrSamplesPerSec=12.01900094549821, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:49:20,137] [INFO] [timer.py:197:stop] 0/2919, RunningAvgSamplesPerSec=11.995732001631197, CurrSamplesPerSec=11.823506121798358, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:49:26,575] [INFO] [logging.py:68:log_dist] [Rank 0] step=2920, skipped=4, lr=[4.633333333333334e-06], mom=[[0.9, 0.999]] [2022-12-19 23:49:26,575] [INFO] [timer.py:197:stop] 0/2920, RunningAvgSamplesPerSec=11.995732514781201, CurrSamplesPerSec=11.99722956021883, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:49:33,041] [INFO] [timer.py:197:stop] 0/2921, RunningAvgSamplesPerSec=11.995710813895954, CurrSamplesPerSec=11.932720260411891, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:49:39,498] [INFO] [timer.py:197:stop] 0/2922, RunningAvgSamplesPerSec=11.995718168514095, CurrSamplesPerSec=12.017224801386078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:49:45,921] [INFO] [timer.py:197:stop] 0/2923, RunningAvgSamplesPerSec=11.995700490170748, CurrSamplesPerSec=11.944300989377206, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:49:52,395] [INFO] [timer.py:197:stop] 0/2924, RunningAvgSamplesPerSec=11.995693940270002, CurrSamplesPerSec=11.97659215658245, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:49:58,853] [INFO] [timer.py:197:stop] 0/2925, RunningAvgSamplesPerSec=11.995663576803596, CurrSamplesPerSec=11.907593135442914, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.622222222222222e-06, 'epoch': 76.97} [2022-12-19 23:50:03,461] [INFO] [timer.py:197:stop] 0/2926, RunningAvgSamplesPerSec=11.996786658576081, CurrSamplesPerSec=16.516830235934275, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:50:09,998] [INFO] [timer.py:197:stop] 0/2927, RunningAvgSamplesPerSec=11.996735902101301, CurrSamplesPerSec=11.850138154625379, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:50:16,426] [INFO] [timer.py:197:stop] 0/2928, RunningAvgSamplesPerSec=11.996702427947078, CurrSamplesPerSec=11.899583441035311, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:50:22,964] [INFO] [timer.py:197:stop] 0/2929, RunningAvgSamplesPerSec=11.996648997598196, CurrSamplesPerSec=11.842323612278657, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:50:29,922] [INFO] [logging.py:68:log_dist] [Rank 0] step=2930, skipped=4, lr=[4.611111111111112e-06], mom=[[0.9, 0.999]] [2022-12-19 23:50:29,923] [INFO] [timer.py:197:stop] 0/2930, RunningAvgSamplesPerSec=11.996635067584124, CurrSamplesPerSec=11.956000070372522, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:50:36,951] [INFO] [timer.py:197:stop] 0/2931, RunningAvgSamplesPerSec=11.996606214478295, CurrSamplesPerSec=11.912715294775909, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:50:43,705] [INFO] [timer.py:197:stop] 0/2932, RunningAvgSamplesPerSec=11.996566373963098, CurrSamplesPerSec=11.88099804374489, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:50:50,395] [INFO] [timer.py:197:stop] 0/2933, RunningAvgSamplesPerSec=11.996546541013602, CurrSamplesPerSec=11.938716221153562, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:50:57,267] [INFO] [timer.py:197:stop] 0/2934, RunningAvgSamplesPerSec=11.996522238051522, CurrSamplesPerSec=11.925710857747417, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:51:03,745] [INFO] [timer.py:197:stop] 
0/2935, RunningAvgSamplesPerSec=11.996500869485446, CurrSamplesPerSec=11.93417385234442, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:51:10,204] [INFO] [timer.py:197:stop] 0/2936, RunningAvgSamplesPerSec=11.996473850232153, CurrSamplesPerSec=11.917746621594173, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:51:16,578] [INFO] [timer.py:197:stop] 0/2937, RunningAvgSamplesPerSec=11.996481105661388, CurrSamplesPerSec=12.017806389073801, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:51:23,218] [INFO] [timer.py:197:stop] 0/2938, RunningAvgSamplesPerSec=11.996467743755263, CurrSamplesPerSec=11.957378378365135, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:51:29,793] [INFO] [timer.py:197:stop] 0/2939, RunningAvgSamplesPerSec=11.9964679691719, CurrSamplesPerSec=11.997129828938789, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:51:36,247] [INFO] [logging.py:68:log_dist] [Rank 0] step=2940, skipped=4, lr=[4.58888888888889e-06], mom=[[0.9, 0.999]] [2022-12-19 23:51:36,247] [INFO] [timer.py:197:stop] 0/2940, RunningAvgSamplesPerSec=11.996449403044647, CurrSamplesPerSec=11.942167504502523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:51:42,628] [INFO] [timer.py:197:stop] 0/2941, RunningAvgSamplesPerSec=11.99646142145258, CurrSamplesPerSec=12.031875777161442, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:51:49,183] [INFO] [timer.py:197:stop] 0/2942, RunningAvgSamplesPerSec=11.996416865153853, CurrSamplesPerSec=11.866880383764185, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:51:55,849] [INFO] [timer.py:197:stop] 0/2943, RunningAvgSamplesPerSec=11.996287801404325, CurrSamplesPerSec=11.628478297926536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:52:02,293] [INFO] [timer.py:197:stop] 0/2944, RunningAvgSamplesPerSec=11.996282511469106, CurrSamplesPerSec=11.980744969119959, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:52:08,943] [INFO] 
[timer.py:197:stop] 0/2945, RunningAvgSamplesPerSec=11.996254411488648, CurrSamplesPerSec=11.914150268010781, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:52:15,561] [INFO] [timer.py:197:stop] 0/2946, RunningAvgSamplesPerSec=11.996259544168167, CurrSamplesPerSec=12.011384070999203, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:52:22,221] [INFO] [timer.py:197:stop] 0/2947, RunningAvgSamplesPerSec=11.996234486108118, CurrSamplesPerSec=11.922914591684341, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:52:28,696] [INFO] [timer.py:197:stop] 0/2948, RunningAvgSamplesPerSec=11.996239290304686, CurrSamplesPerSec=12.01040436115169, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:52:35,081] [INFO] [timer.py:197:stop] 0/2949, RunningAvgSamplesPerSec=11.99623103934141, CurrSamplesPerSec=11.97197287127503, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:52:41,577] [INFO] [logging.py:68:log_dist] [Rank 0] step=2950, skipped=4, lr=[4.566666666666667e-06], mom=[[0.9, 0.999]] [2022-12-19 23:52:41,578] [INFO] [timer.py:197:stop] 0/2950, RunningAvgSamplesPerSec=11.996208716905302, CurrSamplesPerSec=11.930783395592034, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.566666666666667e-06, 'epoch': 77.63} [2022-12-19 23:52:48,108] [INFO] [timer.py:197:stop] 0/2951, RunningAvgSamplesPerSec=11.996178032597875, CurrSamplesPerSec=11.90639791213139, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:52:54,759] [INFO] [timer.py:197:stop] 0/2952, RunningAvgSamplesPerSec=11.996186256064806, CurrSamplesPerSec=12.020486400889292, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:53:01,443] [INFO] [timer.py:197:stop] 0/2953, RunningAvgSamplesPerSec=11.996147620215726, CurrSamplesPerSec=11.883244920949009, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:53:07,962] [INFO] [timer.py:197:stop] 0/2954, RunningAvgSamplesPerSec=11.996154736585686, 
CurrSamplesPerSec=12.01719198451724, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:53:14,471] [INFO] [timer.py:197:stop] 0/2955, RunningAvgSamplesPerSec=11.996131953953771, CurrSamplesPerSec=11.929252699036889, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:53:20,865] [INFO] [timer.py:197:stop] 0/2956, RunningAvgSamplesPerSec=11.996118261782607, CurrSamplesPerSec=11.955821147971559, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:53:27,367] [INFO] [timer.py:197:stop] 0/2957, RunningAvgSamplesPerSec=11.996119392708595, CurrSamplesPerSec=11.999461079006014, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:53:34,048] [INFO] [timer.py:197:stop] 0/2958, RunningAvgSamplesPerSec=11.996097308876331, CurrSamplesPerSec=11.931192779008247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:53:40,525] [INFO] [timer.py:197:stop] 0/2959, RunningAvgSamplesPerSec=11.996071304680594, CurrSamplesPerSec=11.919692487686905, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:53:47,027] [INFO] [logging.py:68:log_dist] [Rank 0] step=2960, skipped=4, lr=[4.544444444444445e-06], mom=[[0.9, 0.999]] [2022-12-19 23:53:47,027] [INFO] [timer.py:197:stop] 0/2960, RunningAvgSamplesPerSec=11.99604622887748, CurrSamplesPerSec=11.922352742305247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:53:53,486] [INFO] [timer.py:197:stop] 0/2961, RunningAvgSamplesPerSec=11.996024109090847, CurrSamplesPerSec=11.93094884250192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:54:00,017] [INFO] [timer.py:197:stop] 0/2962, RunningAvgSamplesPerSec=11.995978875593616, CurrSamplesPerSec=11.863610366701105, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:54:06,624] [INFO] [timer.py:197:stop] 0/2963, RunningAvgSamplesPerSec=11.995941329264916, CurrSamplesPerSec=11.885824723004765, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:54:11,376] [INFO] [timer.py:197:stop] 0/2964, 
RunningAvgSamplesPerSec=11.997055501492447, CurrSamplesPerSec=16.54800543965399, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:54:17,851] [INFO] [timer.py:197:stop] 0/2965, RunningAvgSamplesPerSec=11.997031871974233, CurrSamplesPerSec=11.927447331990503, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:54:24,338] [INFO] [timer.py:197:stop] 0/2966, RunningAvgSamplesPerSec=11.997001112219658, CurrSamplesPerSec=11.90654736793828, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:54:30,851] [INFO] [timer.py:197:stop] 0/2967, RunningAvgSamplesPerSec=11.99696009571496, CurrSamplesPerSec=11.876607200626918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:54:37,383] [INFO] [timer.py:197:stop] 0/2968, RunningAvgSamplesPerSec=11.996915228529806, CurrSamplesPerSec=11.86534348726421, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:54:43,913] [INFO] [timer.py:197:stop] 0/2969, RunningAvgSamplesPerSec=11.996897585357685, CurrSamplesPerSec=11.944795280168572, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:54:50,519] [INFO] [logging.py:68:log_dist] [Rank 0] step=2970, skipped=4, lr=[4.5222222222222225e-06], mom=[[0.9, 0.999]] [2022-12-19 23:54:50,519] [INFO] [timer.py:197:stop] 0/2970, RunningAvgSamplesPerSec=11.996857547689974, CurrSamplesPerSec=11.87923090811894, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:54:57,201] [INFO] [timer.py:197:stop] 0/2971, RunningAvgSamplesPerSec=11.99680818205337, CurrSamplesPerSec=11.852059387575835, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:55:03,813] [INFO] [timer.py:197:stop] 0/2972, RunningAvgSamplesPerSec=11.99681146197726, CurrSamplesPerSec=12.00655746973128, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:55:10,439] [INFO] [timer.py:197:stop] 0/2973, RunningAvgSamplesPerSec=11.996779629721633, CurrSamplesPerSec=11.902977297892392, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:55:17,097] [INFO] 
[timer.py:197:stop] 0/2974, RunningAvgSamplesPerSec=11.996733544682225, CurrSamplesPerSec=11.861360425273467, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:55:23,565] [INFO] [timer.py:197:stop] 0/2975, RunningAvgSamplesPerSec=11.996723111339803, CurrSamplesPerSec=11.965795183877587, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.511111111111111e-06, 'epoch': 78.29} [2022-12-19 23:55:30,010] [INFO] [timer.py:197:stop] 0/2976, RunningAvgSamplesPerSec=11.996694480727873, CurrSamplesPerSec=11.912175550915272, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:55:36,545] [INFO] [timer.py:197:stop] 0/2977, RunningAvgSamplesPerSec=11.996613057500571, CurrSamplesPerSec=11.759253121463436, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:55:43,094] [INFO] [timer.py:197:stop] 0/2978, RunningAvgSamplesPerSec=11.996555715729937, CurrSamplesPerSec=11.828356554579884, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:55:49,712] [INFO] [timer.py:197:stop] 0/2979, RunningAvgSamplesPerSec=11.996520779745097, CurrSamplesPerSec=11.893444910184893, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:55:56,286] [INFO] [logging.py:68:log_dist] [Rank 0] step=2980, skipped=4, lr=[4.5e-06], mom=[[0.9, 0.999]] [2022-12-19 23:55:56,287] [INFO] [timer.py:197:stop] 0/2980, RunningAvgSamplesPerSec=11.996522437097134, CurrSamplesPerSec=12.001458404858285, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:56:02,865] [INFO] [timer.py:197:stop] 0/2981, RunningAvgSamplesPerSec=11.996507633649514, CurrSamplesPerSec=11.952584429469274, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:56:09,499] [INFO] [timer.py:197:stop] 0/2982, RunningAvgSamplesPerSec=11.996469113845306, CurrSamplesPerSec=11.88280620822062, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:56:16,116] [INFO] [timer.py:197:stop] 0/2983, RunningAvgSamplesPerSec=11.996463839330065, CurrSamplesPerSec=11.980766358024153, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:56:22,624] [INFO] [timer.py:197:stop] 0/2984, RunningAvgSamplesPerSec=11.996430543744061, CurrSamplesPerSec=11.897991127327591, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:56:29,090] [INFO] [timer.py:197:stop] 0/2985, RunningAvgSamplesPerSec=11.996410220946451, CurrSamplesPerSec=11.936112348980767, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:56:35,618] [INFO] [timer.py:197:stop] 0/2986, RunningAvgSamplesPerSec=11.99636931530724, CurrSamplesPerSec=11.875576850514863, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:56:42,244] [INFO] [timer.py:197:stop] 0/2987, RunningAvgSamplesPerSec=11.996345028521201, CurrSamplesPerSec=11.924308588064047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:56:48,779] [INFO] [timer.py:197:stop] 0/2988, RunningAvgSamplesPerSec=11.996311359434635, CurrSamplesPerSec=11.896644401666302, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:56:55,264] [INFO] [timer.py:197:stop] 0/2989, RunningAvgSamplesPerSec=11.996315112861362, CurrSamplesPerSec=12.007533329364154, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:57:01,742] [INFO] [logging.py:68:log_dist] [Rank 0] step=2990, skipped=4, lr=[4.477777777777778e-06], mom=[[0.9, 0.999]] [2022-12-19 23:57:01,743] [INFO] [timer.py:197:stop] 0/2990, RunningAvgSamplesPerSec=11.996317648194635, CurrSamplesPerSec=12.003895474011198, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:57:08,066] [INFO] [timer.py:197:stop] 0/2991, RunningAvgSamplesPerSec=11.996329878326453, CurrSamplesPerSec=12.032985210303554, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:57:14,615] [INFO] [timer.py:197:stop] 0/2992, RunningAvgSamplesPerSec=11.996310468536228, CurrSamplesPerSec=11.938573921121622, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:57:21,142] [INFO] [timer.py:197:stop] 0/2993, RunningAvgSamplesPerSec=11.99630162515668, 
CurrSamplesPerSec=11.969918093142468, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:57:27,714] [INFO] [timer.py:197:stop] 0/2994, RunningAvgSamplesPerSec=11.99627390990779, CurrSamplesPerSec=11.913946685917015, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:57:34,285] [INFO] [timer.py:197:stop] 0/2995, RunningAvgSamplesPerSec=11.996276102961138, CurrSamplesPerSec=12.002841310755352, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:57:40,978] [INFO] [timer.py:197:stop] 0/2996, RunningAvgSamplesPerSec=11.996256199160317, CurrSamplesPerSec=11.93697858769332, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:57:47,466] [INFO] [timer.py:197:stop] 0/2997, RunningAvgSamplesPerSec=11.996235841526541, CurrSamplesPerSec=11.935593302935679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:57:53,941] [INFO] [timer.py:197:stop] 0/2998, RunningAvgSamplesPerSec=11.99620069388429, CurrSamplesPerSec=11.89184949742308, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:58:00,428] [INFO] [timer.py:197:stop] 0/2999, RunningAvgSamplesPerSec=11.996163827626125, CurrSamplesPerSec=11.886720522989132, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-19 23:58:06,933] [INFO] [logging.py:68:log_dist] [Rank 0] step=3000, skipped=4, lr=[4.455555555555555e-06], mom=[[0.9, 0.999]] [2022-12-19 23:58:06,934] [INFO] [timer.py:197:stop] 0/3000, RunningAvgSamplesPerSec=11.996168514606284, CurrSamplesPerSec=12.010231867079506, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.455555555555555e-06, 'epoch': 78.95} {'eval_loss': 0.43310546875, 'eval_wer': 17.93002915451895, 'eval_runtime': 168.7524, 'eval_samples_per_second': 7.152, 'eval_steps_per_second': 0.225, 'epoch': 78.95} [2022-12-20 00:00:57,515] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step3000 is begin to save! 
[2022-12-20 00:00:57,522] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: ./checkpoint-3000/global_step3000/mp_rank_00_model_states.pt [2022-12-20 00:00:57,522] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving ./checkpoint-3000/global_step3000/mp_rank_00_model_states.pt... [2022-12-20 00:00:59,296] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved ./checkpoint-3000/global_step3000/mp_rank_00_model_states.pt. [2022-12-20 00:00:59,297] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving ./checkpoint-3000/global_step3000/zero_pp_rank_0_mp_rank_00_optim_states.pt... [2022-12-20 00:01:06,595] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved ./checkpoint-3000/global_step3000/zero_pp_rank_0_mp_rank_00_optim_states.pt. [2022-12-20 00:01:06,595] [INFO] [engine.py:3269:_save_zero_checkpoint] zero checkpoint saved ./checkpoint-3000/global_step3000/zero_pp_rank_0_mp_rank_00_optim_states.pt [2022-12-20 00:01:06,595] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step3000 is ready now! [2022-12-20 00:02:16,833] [INFO] [timer.py:197:stop] 0/3001, RunningAvgSamplesPerSec=11.996044793857228, CurrSamplesPerSec=11.636258212957967, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:02:21,453] [INFO] [timer.py:197:stop] 0/3002, RunningAvgSamplesPerSec=11.997152201666287, CurrSamplesPerSec=16.590150645740703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:02:27,949] [INFO] [timer.py:197:stop] 0/3003, RunningAvgSamplesPerSec=11.997122144039112, CurrSamplesPerSec=11.907622187162564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:02:34,549] [INFO] [timer.py:197:stop] 0/3004, RunningAvgSamplesPerSec=11.997096934905171, CurrSamplesPerSec=11.921918550031402, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:02:40,699] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. 
Attempted loss scale: 65536.0, reducing to 65536.0 [2022-12-20 00:02:40,701] [INFO] [timer.py:197:stop] 0/3005, RunningAvgSamplesPerSec=11.997350062267264, CurrSamplesPerSec=12.808641083433553, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:02:46,859] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 65536.0, reducing to 32768.0 [2022-12-20 00:02:46,860] [INFO] [timer.py:197:stop] 0/3006, RunningAvgSamplesPerSec=11.997601506274274, CurrSamplesPerSec=12.803420103045333, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:02:53,341] [INFO] [timer.py:197:stop] 0/3007, RunningAvgSamplesPerSec=11.99757090665604, CurrSamplesPerSec=11.906348798498046, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:02:59,828] [INFO] [timer.py:197:stop] 0/3008, RunningAvgSamplesPerSec=11.9975292722226, CurrSamplesPerSec=11.873709435450497, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:03:06,319] [INFO] [timer.py:197:stop] 0/3009, RunningAvgSamplesPerSec=11.997512448435423, CurrSamplesPerSec=11.94715249371433, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:03:12,853] [INFO] [logging.py:68:log_dist] [Rank 0] step=3010, skipped=6, lr=[4.437777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 00:03:12,854] [INFO] [timer.py:197:stop] 0/3010, RunningAvgSamplesPerSec=11.997493561602612, CurrSamplesPerSec=11.940968517729017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:03:19,302] [INFO] [timer.py:197:stop] 0/3011, RunningAvgSamplesPerSec=11.997467763267629, CurrSamplesPerSec=11.920365247879097, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:03:25,797] [INFO] [timer.py:197:stop] 0/3012, RunningAvgSamplesPerSec=11.997469400196566, CurrSamplesPerSec=12.002396943030215, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:03:32,266] [INFO] [timer.py:197:stop] 0/3013, RunningAvgSamplesPerSec=11.9974683383987, CurrSamplesPerSec=11.994273178261848, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:03:38,769] [INFO] [timer.py:197:stop] 0/3014, RunningAvgSamplesPerSec=11.997450153509417, CurrSamplesPerSec=11.942944291699652, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:03:45,207] [INFO] [timer.py:197:stop] 0/3015, RunningAvgSamplesPerSec=11.997457827014314, CurrSamplesPerSec=12.020615049989571, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:03:51,608] [INFO] [timer.py:197:stop] 0/3016, RunningAvgSamplesPerSec=11.99746834161269, CurrSamplesPerSec=12.029232731323766, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:03:58,477] [INFO] [timer.py:197:stop] 0/3017, RunningAvgSamplesPerSec=11.997443893381485, CurrSamplesPerSec=11.924206887336636, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:04:04,977] [INFO] [timer.py:197:stop] 0/3018, RunningAvgSamplesPerSec=11.997416223928466, CurrSamplesPerSec=11.91456908733939, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:04:11,354] [INFO] [timer.py:197:stop] 0/3019, RunningAvgSamplesPerSec=11.99742501034162, CurrSamplesPerSec=12.023983514088822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:04:18,044] [INFO] [logging.py:68:log_dist] [Rank 0] step=3020, skipped=6, lr=[4.415555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 00:04:18,045] [INFO] [timer.py:197:stop] 0/3020, RunningAvgSamplesPerSec=11.997396956038495, CurrSamplesPerSec=11.913350256873203, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:04:24,680] [INFO] [timer.py:197:stop] 0/3021, RunningAvgSamplesPerSec=11.997398643492101, CurrSamplesPerSec=12.002493541910669, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:04:31,096] [INFO] [timer.py:197:stop] 0/3022, RunningAvgSamplesPerSec=11.997403199614954, CurrSamplesPerSec=12.011173927776293, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:04:37,594] [INFO] [timer.py:197:stop] 0/3023, RunningAvgSamplesPerSec=11.997377432882557, 
CurrSamplesPerSec=11.920063528566384, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:04:44,053] [INFO] [timer.py:197:stop] 0/3024, RunningAvgSamplesPerSec=11.997370634341676, CurrSamplesPerSec=11.976867353680998, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:04:50,573] [INFO] [timer.py:197:stop] 0/3025, RunningAvgSamplesPerSec=11.997350914258075, CurrSamplesPerSec=11.938051475468788, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.404444444444445e-06, 'epoch': 79.61} [2022-12-20 00:04:57,106] [INFO] [timer.py:197:stop] 0/3026, RunningAvgSamplesPerSec=11.997302444055123, CurrSamplesPerSec=11.852545549886356, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:05:03,633] [INFO] [timer.py:197:stop] 0/3027, RunningAvgSamplesPerSec=11.997263829946233, CurrSamplesPerSec=11.881620688640188, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:05:10,182] [INFO] [timer.py:197:stop] 0/3028, RunningAvgSamplesPerSec=11.997250156594333, CurrSamplesPerSec=11.956030423811129, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:05:16,679] [INFO] [timer.py:197:stop] 0/3029, RunningAvgSamplesPerSec=11.99722565044906, CurrSamplesPerSec=11.923525749059635, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:05:23,178] [INFO] [logging.py:68:log_dist] [Rank 0] step=3030, skipped=6, lr=[4.393333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 00:05:23,179] [INFO] [timer.py:197:stop] 0/3030, RunningAvgSamplesPerSec=11.997165522725112, CurrSamplesPerSec=11.817879710403336, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:05:29,674] [INFO] [timer.py:197:stop] 0/3031, RunningAvgSamplesPerSec=11.997139483820868, CurrSamplesPerSec=11.918808645980171, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:05:36,219] [INFO] [timer.py:197:stop] 0/3032, RunningAvgSamplesPerSec=11.997108491813266, CurrSamplesPerSec=11.903962786493288, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
00:05:42,649] [INFO] [timer.py:197:stop] 0/3033, RunningAvgSamplesPerSec=11.997114271627051, CurrSamplesPerSec=12.014652717665204, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:05:49,128] [INFO] [timer.py:197:stop] 0/3034, RunningAvgSamplesPerSec=11.997069890379871, CurrSamplesPerSec=11.864042419575489, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:05:55,588] [INFO] [timer.py:197:stop] 0/3035, RunningAvgSamplesPerSec=11.99707390665435, CurrSamplesPerSec=12.009263627841456, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:06:02,321] [INFO] [timer.py:197:stop] 0/3036, RunningAvgSamplesPerSec=11.99704383156059, CurrSamplesPerSec=11.906514624649022, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:06:08,805] [INFO] [timer.py:197:stop] 0/3037, RunningAvgSamplesPerSec=11.997048722948326, CurrSamplesPerSec=12.011907579953432, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:06:15,342] [INFO] [timer.py:197:stop] 0/3038, RunningAvgSamplesPerSec=11.99705168575018, CurrSamplesPerSec=12.006050536470763, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:06:21,963] [INFO] [timer.py:197:stop] 0/3039, RunningAvgSamplesPerSec=11.997056362126635, CurrSamplesPerSec=12.011270667988567, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:06:26,612] [INFO] [logging.py:68:log_dist] [Rank 0] step=3040, skipped=6, lr=[4.371111111111112e-06], mom=[[0.9, 0.999]] [2022-12-20 00:06:26,612] [INFO] [timer.py:197:stop] 0/3040, RunningAvgSamplesPerSec=11.998137335855523, CurrSamplesPerSec=16.518250101640895, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:06:33,340] [INFO] [timer.py:197:stop] 0/3041, RunningAvgSamplesPerSec=11.998108421022623, CurrSamplesPerSec=11.910903830779949, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:06:39,755] [INFO] [timer.py:197:stop] 0/3042, RunningAvgSamplesPerSec=11.998082581210264, CurrSamplesPerSec=11.92006617516062, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 00:06:46,253] [INFO] [timer.py:197:stop] 0/3043, RunningAvgSamplesPerSec=11.998057405340418, CurrSamplesPerSec=11.922008033412745, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:06:52,890] [INFO] [timer.py:197:stop] 0/3044, RunningAvgSamplesPerSec=11.998025303763194, CurrSamplesPerSec=11.901192537944782, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:06:59,413] [INFO] [timer.py:197:stop] 0/3045, RunningAvgSamplesPerSec=11.998001244135432, CurrSamplesPerSec=11.925255759395407, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:07:05,855] [INFO] [timer.py:197:stop] 0/3046, RunningAvgSamplesPerSec=11.99800772684563, CurrSamplesPerSec=12.017767112647208, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:07:12,322] [INFO] [timer.py:197:stop] 0/3047, RunningAvgSamplesPerSec=11.998007259585858, CurrSamplesPerSec=11.996585089490345, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:07:18,820] [INFO] [timer.py:197:stop] 0/3048, RunningAvgSamplesPerSec=11.998001662743752, CurrSamplesPerSec=11.980983459723303, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:07:25,491] [INFO] [timer.py:197:stop] 0/3049, RunningAvgSamplesPerSec=11.99800546412814, CurrSamplesPerSec=12.009595670101506, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:07:31,971] [INFO] [logging.py:68:log_dist] [Rank 0] step=3050, skipped=6, lr=[4.348888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 00:07:31,972] [INFO] [timer.py:197:stop] 0/3050, RunningAvgSamplesPerSec=11.997974449253661, CurrSamplesPerSec=11.904210899854553, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.348888888888889e-06, 'epoch': 80.26} [2022-12-20 00:07:38,422] [INFO] [timer.py:197:stop] 0/3051, RunningAvgSamplesPerSec=11.99796858044156, CurrSamplesPerSec=11.980107080156056, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:07:45,097] [INFO] [timer.py:197:stop] 0/3052, 
RunningAvgSamplesPerSec=11.997975398126133, CurrSamplesPerSec=12.01879860752581, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:07:51,564] [INFO] [timer.py:197:stop] 0/3053, RunningAvgSamplesPerSec=11.997947642009834, CurrSamplesPerSec=11.913884819490317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:07:58,052] [INFO] [timer.py:197:stop] 0/3054, RunningAvgSamplesPerSec=11.997948948368352, CurrSamplesPerSec=12.00193597312136, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:08:04,558] [INFO] [timer.py:197:stop] 0/3055, RunningAvgSamplesPerSec=11.99793341699228, CurrSamplesPerSec=11.9507182572705, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:08:11,108] [INFO] [timer.py:197:stop] 0/3056, RunningAvgSamplesPerSec=11.997890698561147, CurrSamplesPerSec=11.868874221495407, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:08:17,758] [INFO] [timer.py:197:stop] 0/3057, RunningAvgSamplesPerSec=11.997868891621248, CurrSamplesPerSec=11.931638254020273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:08:24,253] [INFO] [timer.py:197:stop] 0/3058, RunningAvgSamplesPerSec=11.997832847556127, CurrSamplesPerSec=11.888719980739628, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:08:30,722] [INFO] [timer.py:197:stop] 0/3059, RunningAvgSamplesPerSec=11.997819979884229, CurrSamplesPerSec=11.958624880980844, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:08:37,226] [INFO] [logging.py:68:log_dist] [Rank 0] step=3060, skipped=6, lr=[4.326666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 00:08:37,227] [INFO] [timer.py:197:stop] 0/3060, RunningAvgSamplesPerSec=11.997775450148701, CurrSamplesPerSec=11.863175723900637, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:08:43,631] [INFO] [timer.py:197:stop] 0/3061, RunningAvgSamplesPerSec=11.997755524341741, CurrSamplesPerSec=11.937130404729112, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:08:50,079] [INFO] 
[timer.py:197:stop] 0/3062, RunningAvgSamplesPerSec=11.997753087270247, CurrSamplesPerSec=11.990302716494535, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:08:56,600] [INFO] [timer.py:197:stop] 0/3063, RunningAvgSamplesPerSec=11.997726135428254, CurrSamplesPerSec=11.915816730002883, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:09:03,097] [INFO] [timer.py:197:stop] 0/3064, RunningAvgSamplesPerSec=11.997729221152065, CurrSamplesPerSec=12.007182066072735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:09:09,542] [INFO] [timer.py:197:stop] 0/3065, RunningAvgSamplesPerSec=11.99773270069566, CurrSamplesPerSec=12.008396536097822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:09:15,993] [INFO] [timer.py:197:stop] 0/3066, RunningAvgSamplesPerSec=11.997702264422797, CurrSamplesPerSec=11.905195008745437, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:09:22,623] [INFO] [timer.py:197:stop] 0/3067, RunningAvgSamplesPerSec=11.997563490227485, CurrSamplesPerSec=11.586917749577363, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:09:29,181] [INFO] [timer.py:197:stop] 0/3068, RunningAvgSamplesPerSec=11.997525467981845, CurrSamplesPerSec=11.882108753235222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:09:35,717] [INFO] [timer.py:197:stop] 0/3069, RunningAvgSamplesPerSec=11.997492224696185, CurrSamplesPerSec=11.896427182386963, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:09:42,199] [INFO] [logging.py:68:log_dist] [Rank 0] step=3070, skipped=6, lr=[4.304444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 00:09:42,200] [INFO] [timer.py:197:stop] 0/3070, RunningAvgSamplesPerSec=11.99748728376095, CurrSamplesPerSec=11.982352558079617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:09:48,625] [INFO] [timer.py:197:stop] 0/3071, RunningAvgSamplesPerSec=11.997445900545294, CurrSamplesPerSec=11.871812155068021, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
00:09:55,107] [INFO] [timer.py:197:stop] 0/3072, RunningAvgSamplesPerSec=11.997425925759204, CurrSamplesPerSec=11.936435050669344, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:10:01,615] [INFO] [timer.py:197:stop] 0/3073, RunningAvgSamplesPerSec=11.997382697367696, CurrSamplesPerSec=11.866123950137034, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:10:08,130] [INFO] [timer.py:197:stop] 0/3074, RunningAvgSamplesPerSec=11.997378705728275, CurrSamplesPerSec=11.985132897299996, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:10:14,603] [INFO] [timer.py:197:stop] 0/3075, RunningAvgSamplesPerSec=11.997348569277566, CurrSamplesPerSec=11.905478550993047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.2933333333333334e-06, 'epoch': 80.92} [2022-12-20 00:10:21,197] [INFO] [timer.py:197:stop] 0/3076, RunningAvgSamplesPerSec=11.997264571495556, CurrSamplesPerSec=11.744577797506995, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:10:27,679] [INFO] [timer.py:197:stop] 0/3077, RunningAvgSamplesPerSec=11.997239792555373, CurrSamplesPerSec=11.92155004091385, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:10:32,355] [INFO] [timer.py:197:stop] 0/3078, RunningAvgSamplesPerSec=11.998321977428954, CurrSamplesPerSec=16.603770933358767, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:10:38,822] [INFO] [timer.py:197:stop] 0/3079, RunningAvgSamplesPerSec=11.99826974277214, CurrSamplesPerSec=11.839719836188856, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:10:45,474] [INFO] [logging.py:68:log_dist] [Rank 0] step=3080, skipped=6, lr=[4.282222222222222e-06], mom=[[0.9, 0.999]] [2022-12-20 00:10:45,475] [INFO] [timer.py:197:stop] 0/3080, RunningAvgSamplesPerSec=11.998235088276056, CurrSamplesPerSec=11.892542827657197, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:10:51,863] [INFO] [timer.py:197:stop] 0/3081, RunningAvgSamplesPerSec=11.998233277704008, 
CurrSamplesPerSec=11.992662925098601, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:10:58,409] [INFO] [timer.py:197:stop] 0/3082, RunningAvgSamplesPerSec=11.998200913796575, CurrSamplesPerSec=11.899373499171432, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:11:04,975] [INFO] [timer.py:197:stop] 0/3083, RunningAvgSamplesPerSec=11.99817224843362, CurrSamplesPerSec=11.910528075183663, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:11:11,450] [INFO] [timer.py:197:stop] 0/3084, RunningAvgSamplesPerSec=11.99817536240374, CurrSamplesPerSec=12.007777184774522, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:11:17,920] [INFO] [timer.py:197:stop] 0/3085, RunningAvgSamplesPerSec=11.998151243691984, CurrSamplesPerSec=11.92427521732164, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:11:24,438] [INFO] [timer.py:197:stop] 0/3086, RunningAvgSamplesPerSec=11.998118886028577, CurrSamplesPerSec=11.899183080922818, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:11:31,061] [INFO] [timer.py:197:stop] 0/3087, RunningAvgSamplesPerSec=11.998112654051345, CurrSamplesPerSec=11.978923983997664, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:11:37,611] [INFO] [timer.py:197:stop] 0/3088, RunningAvgSamplesPerSec=11.998074897861684, CurrSamplesPerSec=11.882717312346257, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:11:44,116] [INFO] [timer.py:197:stop] 0/3089, RunningAvgSamplesPerSec=11.998053020731357, CurrSamplesPerSec=11.930918085998378, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:11:50,544] [INFO] [logging.py:68:log_dist] [Rank 0] step=3090, skipped=6, lr=[4.26e-06], mom=[[0.9, 0.999]] [2022-12-20 00:11:50,545] [INFO] [timer.py:197:stop] 0/3090, RunningAvgSamplesPerSec=11.998053744441863, CurrSamplesPerSec=12.000288254986213, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:11:57,014] [INFO] [timer.py:197:stop] 0/3091, RunningAvgSamplesPerSec=11.998031296408055, 
CurrSamplesPerSec=11.92911009438706, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:12:03,386] [INFO] [timer.py:197:stop] 0/3092, RunningAvgSamplesPerSec=11.99803566964251, CurrSamplesPerSec=12.011559823023825, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:12:09,964] [INFO] [timer.py:197:stop] 0/3093, RunningAvgSamplesPerSec=11.998008387567113, CurrSamplesPerSec=11.914295159245214, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:12:16,450] [INFO] [timer.py:197:stop] 0/3094, RunningAvgSamplesPerSec=11.997938592618484, CurrSamplesPerSec=11.786014272541134, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:12:22,951] [INFO] [timer.py:197:stop] 0/3095, RunningAvgSamplesPerSec=11.997939563141582, CurrSamplesPerSec=12.00094117154756, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:12:29,420] [INFO] [timer.py:197:stop] 0/3096, RunningAvgSamplesPerSec=11.997942465402677, CurrSamplesPerSec=12.006925882412615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:12:35,874] [INFO] [timer.py:197:stop] 0/3097, RunningAvgSamplesPerSec=11.997955032998016, CurrSamplesPerSec=12.036965643217295, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:12:42,400] [INFO] [timer.py:197:stop] 0/3098, RunningAvgSamplesPerSec=11.997928868618583, CurrSamplesPerSec=11.917493181865606, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:12:49,006] [INFO] [timer.py:197:stop] 0/3099, RunningAvgSamplesPerSec=11.99789219945424, CurrSamplesPerSec=11.885428971892106, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:12:55,519] [INFO] [logging.py:68:log_dist] [Rank 0] step=3100, skipped=6, lr=[4.2377777777777775e-06], mom=[[0.9, 0.999]] [2022-12-20 00:12:55,519] [INFO] [timer.py:197:stop] 0/3100, RunningAvgSamplesPerSec=11.997879863086059, CurrSamplesPerSec=11.959795445006911, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.2377777777777775e-06, 'epoch': 81.58} [2022-12-20 
00:13:02,025] [INFO] [timer.py:197:stop] 0/3101, RunningAvgSamplesPerSec=11.997850150778877, CurrSamplesPerSec=11.906502477990738, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:13:08,551] [INFO] [timer.py:197:stop] 0/3102, RunningAvgSamplesPerSec=11.997818912359017, CurrSamplesPerSec=11.901786167162715, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:13:15,034] [INFO] [timer.py:197:stop] 0/3103, RunningAvgSamplesPerSec=11.997817154713843, CurrSamplesPerSec=11.992370928828514, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:13:21,633] [INFO] [timer.py:197:stop] 0/3104, RunningAvgSamplesPerSec=11.99778501875676, CurrSamplesPerSec=11.898952582775562, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:13:28,161] [INFO] [timer.py:197:stop] 0/3105, RunningAvgSamplesPerSec=11.997759525909185, CurrSamplesPerSec=11.919198684514509, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:13:34,629] [INFO] [timer.py:197:stop] 0/3106, RunningAvgSamplesPerSec=11.997746015654709, CurrSamplesPerSec=11.955969717088035, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:13:41,080] [INFO] [timer.py:197:stop] 0/3107, RunningAvgSamplesPerSec=11.997722259972198, CurrSamplesPerSec=11.924435186705816, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:13:47,502] [INFO] [timer.py:197:stop] 0/3108, RunningAvgSamplesPerSec=11.997729913224028, CurrSamplesPerSec=12.021540435742986, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:13:53,945] [INFO] [timer.py:197:stop] 0/3109, RunningAvgSamplesPerSec=11.997705163078567, CurrSamplesPerSec=11.92132079314453, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:14:00,415] [INFO] [logging.py:68:log_dist] [Rank 0] step=3110, skipped=6, lr=[4.215555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 00:14:00,416] [INFO] [timer.py:197:stop] 0/3110, RunningAvgSamplesPerSec=11.99771375564513, CurrSamplesPerSec=12.024470417572974, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 00:14:06,934] [INFO] [timer.py:197:stop] 0/3111, RunningAvgSamplesPerSec=11.997693937145652, CurrSamplesPerSec=11.936412758215338, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:14:13,324] [INFO] [timer.py:197:stop] 0/3112, RunningAvgSamplesPerSec=11.997700748716818, CurrSamplesPerSec=12.018915381534937, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:14:19,864] [INFO] [timer.py:197:stop] 0/3113, RunningAvgSamplesPerSec=11.997680150845516, CurrSamplesPerSec=11.933961096513254, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:14:26,341] [INFO] [timer.py:197:stop] 0/3114, RunningAvgSamplesPerSec=11.997662454047306, CurrSamplesPerSec=11.942859275864476, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:14:32,845] [INFO] [timer.py:197:stop] 0/3115, RunningAvgSamplesPerSec=11.997638241915002, CurrSamplesPerSec=11.922760488233848, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:14:37,455] [INFO] [timer.py:197:stop] 0/3116, RunningAvgSamplesPerSec=11.998715393867885, CurrSamplesPerSec=16.65299816425566, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:14:43,897] [INFO] [timer.py:197:stop] 0/3117, RunningAvgSamplesPerSec=11.99872597893435, CurrSamplesPerSec=12.03177870475378, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:14:50,290] [INFO] [timer.py:197:stop] 0/3118, RunningAvgSamplesPerSec=11.998702840549317, CurrSamplesPerSec=11.92705728336811, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:14:56,749] [INFO] [timer.py:197:stop] 0/3119, RunningAvgSamplesPerSec=11.99871296634061, CurrSamplesPerSec=12.030348146977275, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:15:03,330] [INFO] [logging.py:68:log_dist] [Rank 0] step=3120, skipped=6, lr=[4.1933333333333336e-06], mom=[[0.9, 0.999]] [2022-12-20 00:15:03,331] [INFO] [timer.py:197:stop] 0/3120, RunningAvgSamplesPerSec=11.998653435192747, CurrSamplesPerSec=11.815921691386881, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:15:09,762] [INFO] [timer.py:197:stop] 0/3121, RunningAvgSamplesPerSec=11.998633636349197, CurrSamplesPerSec=11.937216931550571, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:15:16,180] [INFO] [timer.py:197:stop] 0/3122, RunningAvgSamplesPerSec=11.998627865021374, CurrSamplesPerSec=11.980654067129128, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:15:22,645] [INFO] [timer.py:197:stop] 0/3123, RunningAvgSamplesPerSec=11.99860409337525, CurrSamplesPerSec=11.92489234141877, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:15:29,112] [INFO] [timer.py:197:stop] 0/3124, RunningAvgSamplesPerSec=11.99857110357868, CurrSamplesPerSec=11.896486231503053, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:15:35,576] [INFO] [timer.py:197:stop] 0/3125, RunningAvgSamplesPerSec=11.998545168618035, CurrSamplesPerSec=11.91811912877625, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.182222222222222e-06, 'epoch': 82.24} [2022-12-20 00:15:42,118] [INFO] [timer.py:197:stop] 0/3126, RunningAvgSamplesPerSec=11.998516335178442, CurrSamplesPerSec=11.909140468835181, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:15:48,617] [INFO] [timer.py:197:stop] 0/3127, RunningAvgSamplesPerSec=11.998482513202886, CurrSamplesPerSec=11.89374528345024, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:15:55,058] [INFO] [timer.py:197:stop] 0/3128, RunningAvgSamplesPerSec=11.998484438568223, CurrSamplesPerSec=12.004504224902139, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:16:01,520] [INFO] [timer.py:197:stop] 0/3129, RunningAvgSamplesPerSec=11.998486100981058, CurrSamplesPerSec=12.003685055964038, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:16:08,052] [INFO] [logging.py:68:log_dist] [Rank 0] step=3130, skipped=6, lr=[4.171111111111111e-06], mom=[[0.9, 0.999]] [2022-12-20 00:16:08,053] [INFO] [timer.py:197:stop] 
0/3130, RunningAvgSamplesPerSec=11.99845334570805, CurrSamplesPerSec=11.896894846359976, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:16:14,465] [INFO] [timer.py:197:stop] 0/3131, RunningAvgSamplesPerSec=11.998448767483762, CurrSamplesPerSec=11.984145159366292, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:16:20,972] [INFO] [timer.py:197:stop] 0/3132, RunningAvgSamplesPerSec=11.998423384707266, CurrSamplesPerSec=11.919523118644777, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:16:27,424] [INFO] [timer.py:197:stop] 0/3133, RunningAvgSamplesPerSec=11.998429170925798, CurrSamplesPerSec=12.016567422213747, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:16:33,870] [INFO] [timer.py:197:stop] 0/3134, RunningAvgSamplesPerSec=11.998434121776178, CurrSamplesPerSec=12.013955292954321, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:16:40,415] [INFO] [timer.py:197:stop] 0/3135, RunningAvgSamplesPerSec=11.998400815989433, CurrSamplesPerSec=11.894986460162, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:16:46,866] [INFO] [timer.py:197:stop] 0/3136, RunningAvgSamplesPerSec=11.998390572386823, CurrSamplesPerSec=11.96638300634295, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:16:53,402] [INFO] [timer.py:197:stop] 0/3137, RunningAvgSamplesPerSec=11.998356161819338, CurrSamplesPerSec=11.891474416285021, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:16:59,930] [INFO] [timer.py:197:stop] 0/3138, RunningAvgSamplesPerSec=11.998320423772693, CurrSamplesPerSec=11.887318499883932, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:17:06,461] [INFO] [timer.py:197:stop] 0/3139, RunningAvgSamplesPerSec=11.998279972223095, CurrSamplesPerSec=11.872751529763852, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:17:12,913] [INFO] [logging.py:68:log_dist] [Rank 0] step=3140, skipped=6, lr=[4.148888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 00:17:12,914] [INFO] 
[timer.py:197:stop] 0/3140, RunningAvgSamplesPerSec=11.998278013380478, CurrSamplesPerSec=11.992136270571905, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:17:19,401] [INFO] [timer.py:197:stop] 0/3141, RunningAvgSamplesPerSec=11.99825169600599, CurrSamplesPerSec=11.916232492632243, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:17:25,868] [INFO] [timer.py:197:stop] 0/3142, RunningAvgSamplesPerSec=11.998253989204757, CurrSamplesPerSec=12.005456662761086, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:17:32,332] [INFO] [timer.py:197:stop] 0/3143, RunningAvgSamplesPerSec=11.998245892782629, CurrSamplesPerSec=11.972876898107103, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:17:38,821] [INFO] [timer.py:197:stop] 0/3144, RunningAvgSamplesPerSec=11.998229822979214, CurrSamplesPerSec=11.947966091842442, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:17:45,270] [INFO] [timer.py:197:stop] 0/3145, RunningAvgSamplesPerSec=11.99821467817847, CurrSamplesPerSec=11.95081775046133, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:17:51,684] [INFO] [timer.py:197:stop] 0/3146, RunningAvgSamplesPerSec=11.998194475505239, CurrSamplesPerSec=11.935031849965698, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:17:58,228] [INFO] [timer.py:197:stop] 0/3147, RunningAvgSamplesPerSec=11.99815482935327, CurrSamplesPerSec=11.874789365345377, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:18:04,730] [INFO] [timer.py:197:stop] 0/3148, RunningAvgSamplesPerSec=11.998138164140444, CurrSamplesPerSec=11.945954100663679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:18:11,257] [INFO] [timer.py:197:stop] 0/3149, RunningAvgSamplesPerSec=11.998114410203014, CurrSamplesPerSec=11.923847240866813, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:18:17,690] [INFO] [logging.py:68:log_dist] [Rank 0] step=3150, skipped=6, lr=[4.126666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 
00:18:17,690] [INFO] [timer.py:197:stop] 0/3150, RunningAvgSamplesPerSec=11.998114369297681, CurrSamplesPerSec=11.997985641595365, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.126666666666667e-06, 'epoch': 82.89} [2022-12-20 00:18:24,273] [INFO] [timer.py:197:stop] 0/3151, RunningAvgSamplesPerSec=11.998061607831989, CurrSamplesPerSec=11.834237110386946, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:18:30,729] [INFO] [timer.py:197:stop] 0/3152, RunningAvgSamplesPerSec=11.998048414337394, CurrSamplesPerSec=11.956645513534012, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:18:37,146] [INFO] [timer.py:197:stop] 0/3153, RunningAvgSamplesPerSec=11.99803847152118, CurrSamplesPerSec=11.96680017129344, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:18:41,778] [INFO] [timer.py:197:stop] 0/3154, RunningAvgSamplesPerSec=11.999073694153315, CurrSamplesPerSec=16.47945187398617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:18:48,250] [INFO] [timer.py:197:stop] 0/3155, RunningAvgSamplesPerSec=11.999023885819987, CurrSamplesPerSec=11.844056267014482, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:18:54,730] [INFO] [timer.py:197:stop] 0/3156, RunningAvgSamplesPerSec=11.998981638456286, CurrSamplesPerSec=11.867238700479113, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:19:01,130] [INFO] [timer.py:197:stop] 0/3157, RunningAvgSamplesPerSec=11.998961707127071, CurrSamplesPerSec=11.936426027523169, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:19:07,617] [INFO] [timer.py:197:stop] 0/3158, RunningAvgSamplesPerSec=11.998916667750956, CurrSamplesPerSec=11.85848109564254, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:19:14,175] [INFO] [timer.py:197:stop] 0/3159, RunningAvgSamplesPerSec=11.99887815891762, CurrSamplesPerSec=11.878563308495194, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:19:20,639] [INFO] [logging.py:68:log_dist] [Rank 
0] step=3160, skipped=6, lr=[4.104444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 00:19:20,640] [INFO] [timer.py:197:stop] 0/3160, RunningAvgSamplesPerSec=11.998838169983848, CurrSamplesPerSec=11.873907968135773, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:19:27,102] [INFO] [timer.py:197:stop] 0/3161, RunningAvgSamplesPerSec=11.998833047245903, CurrSamplesPerSec=11.982677230028086, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:19:33,608] [INFO] [timer.py:197:stop] 0/3162, RunningAvgSamplesPerSec=11.998808567207588, CurrSamplesPerSec=11.921971498615497, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:19:40,129] [INFO] [timer.py:197:stop] 0/3163, RunningAvgSamplesPerSec=11.998781773162161, CurrSamplesPerSec=11.914706056110054, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:19:46,570] [INFO] [timer.py:197:stop] 0/3164, RunningAvgSamplesPerSec=11.99877471916811, CurrSamplesPerSec=11.976518416595727, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:19:53,578] [INFO] [timer.py:197:stop] 0/3165, RunningAvgSamplesPerSec=11.998740932985502, CurrSamplesPerSec=11.892852112733777, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:20:00,125] [INFO] [timer.py:197:stop] 0/3166, RunningAvgSamplesPerSec=11.998705932629715, CurrSamplesPerSec=11.889012216915342, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:20:07,162] [INFO] [timer.py:197:stop] 0/3167, RunningAvgSamplesPerSec=11.998653077697263, CurrSamplesPerSec=11.833719583003768, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:20:14,078] [INFO] [timer.py:197:stop] 0/3168, RunningAvgSamplesPerSec=11.99861595415069, CurrSamplesPerSec=11.882259703744099, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:20:20,996] [INFO] [timer.py:197:stop] 0/3169, RunningAvgSamplesPerSec=11.998617892767134, CurrSamplesPerSec=12.00475869463078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:20:27,554] [INFO] 
[logging.py:68:log_dist] [Rank 0] step=3170, skipped=6, lr=[4.0822222222222225e-06], mom=[[0.9, 0.999]] [2022-12-20 00:20:27,555] [INFO] [timer.py:197:stop] 0/3170, RunningAvgSamplesPerSec=11.998630710763868, CurrSamplesPerSec=12.039363158754679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:20:34,029] [INFO] [timer.py:197:stop] 0/3171, RunningAvgSamplesPerSec=11.998637575311493, CurrSamplesPerSec=12.020423961314393, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:20:40,480] [INFO] [timer.py:197:stop] 0/3172, RunningAvgSamplesPerSec=11.99861695449692, CurrSamplesPerSec=11.93362367389341, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:20:47,016] [INFO] [timer.py:197:stop] 0/3173, RunningAvgSamplesPerSec=11.998569602207356, CurrSamplesPerSec=11.850318112968676, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:20:53,550] [INFO] [timer.py:197:stop] 0/3174, RunningAvgSamplesPerSec=11.99853536547589, CurrSamplesPerSec=11.890944496469068, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:21:00,096] [INFO] [timer.py:197:stop] 0/3175, RunningAvgSamplesPerSec=11.99853944129502, CurrSamplesPerSec=12.011481889534199, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.071111111111111e-06, 'epoch': 83.55} [2022-12-20 00:21:06,803] [INFO] [timer.py:197:stop] 0/3176, RunningAvgSamplesPerSec=11.99850536355186, CurrSamplesPerSec=11.891342722059656, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:21:13,295] [INFO] [timer.py:197:stop] 0/3177, RunningAvgSamplesPerSec=11.998491049104064, CurrSamplesPerSec=11.953228439351012, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:21:19,709] [INFO] [timer.py:197:stop] 0/3178, RunningAvgSamplesPerSec=11.998466864949597, CurrSamplesPerSec=11.922170589500558, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:21:26,270] [INFO] [timer.py:197:stop] 0/3179, RunningAvgSamplesPerSec=11.998448542949014, 
CurrSamplesPerSec=11.940538811033326, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:21:32,817] [INFO] [logging.py:68:log_dist] [Rank 0] step=3180, skipped=6, lr=[4.060000000000001e-06], mom=[[0.9, 0.999]] [2022-12-20 00:21:32,818] [INFO] [timer.py:197:stop] 0/3180, RunningAvgSamplesPerSec=11.998426862848065, CurrSamplesPerSec=11.929942444741863, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:21:39,389] [INFO] [timer.py:197:stop] 0/3181, RunningAvgSamplesPerSec=11.998406143381539, CurrSamplesPerSec=11.932919179998446, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:21:45,865] [INFO] [timer.py:197:stop] 0/3182, RunningAvgSamplesPerSec=11.998413249333415, CurrSamplesPerSec=12.021045694629281, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:21:52,323] [INFO] [timer.py:197:stop] 0/3183, RunningAvgSamplesPerSec=11.998409918684434, CurrSamplesPerSec=11.987827799110722, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:21:58,762] [INFO] [timer.py:197:stop] 0/3184, RunningAvgSamplesPerSec=11.998406747717123, CurrSamplesPerSec=11.988328376072957, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:22:05,264] [INFO] [timer.py:197:stop] 0/3185, RunningAvgSamplesPerSec=11.998382563034362, CurrSamplesPerSec=11.921917491064518, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:22:11,808] [INFO] [timer.py:197:stop] 0/3186, RunningAvgSamplesPerSec=11.998356701717357, CurrSamplesPerSec=11.916601202085964, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:22:18,390] [INFO] [timer.py:197:stop] 0/3187, RunningAvgSamplesPerSec=11.998355018590667, CurrSamplesPerSec=11.992998336526353, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:22:24,994] [INFO] [timer.py:197:stop] 0/3188, RunningAvgSamplesPerSec=11.998330474019864, CurrSamplesPerSec=11.92066221811735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:22:31,466] [INFO] [timer.py:197:stop] 0/3189, 
RunningAvgSamplesPerSec=11.998326946132774, CurrSamplesPerSec=11.987097620629804, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:22:38,026] [INFO] [logging.py:68:log_dist] [Rank 0] step=3190, skipped=6, lr=[4.0377777777777786e-06], mom=[[0.9, 0.999]] [2022-12-20 00:22:38,026] [INFO] [timer.py:197:stop] 0/3190, RunningAvgSamplesPerSec=11.998304613432792, CurrSamplesPerSec=11.927550147968288, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:22:44,499] [INFO] [timer.py:197:stop] 0/3191, RunningAvgSamplesPerSec=11.998263745903378, CurrSamplesPerSec=11.869378033885871, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:22:49,178] [INFO] [timer.py:197:stop] 0/3192, RunningAvgSamplesPerSec=11.999287619551385, CurrSamplesPerSec=16.485567760060796, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:22:55,657] [INFO] [timer.py:197:stop] 0/3193, RunningAvgSamplesPerSec=11.999256484048518, CurrSamplesPerSec=11.9007498610137, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:23:02,220] [INFO] [timer.py:197:stop] 0/3194, RunningAvgSamplesPerSec=11.99924893639092, CurrSamplesPerSec=11.975212621170558, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:23:08,683] [INFO] [timer.py:197:stop] 0/3195, RunningAvgSamplesPerSec=11.999216689745047, CurrSamplesPerSec=11.897161120178076, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:23:15,254] [INFO] [timer.py:197:stop] 0/3196, RunningAvgSamplesPerSec=11.999220703571325, CurrSamplesPerSec=12.012050558504473, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:23:21,859] [INFO] [timer.py:197:stop] 0/3197, RunningAvgSamplesPerSec=11.999186232677298, CurrSamplesPerSec=11.890087559105398, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:23:28,425] [INFO] [timer.py:197:stop] 0/3198, RunningAvgSamplesPerSec=11.999183122947379, CurrSamplesPerSec=11.989255758504324, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:23:34,983] [INFO] 
[timer.py:197:stop] 0/3199, RunningAvgSamplesPerSec=11.999160435817041, CurrSamplesPerSec=11.927088019969457, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:23:41,498] [INFO] [logging.py:68:log_dist] [Rank 0] step=3200, skipped=6, lr=[4.015555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 00:23:41,498] [INFO] [timer.py:197:stop] 0/3200, RunningAvgSamplesPerSec=11.999138066531769, CurrSamplesPerSec=11.928047293526289, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 4.015555555555556e-06, 'epoch': 84.21} [2022-12-20 00:23:47,978] [INFO] [timer.py:197:stop] 0/3201, RunningAvgSamplesPerSec=11.999143347892405, CurrSamplesPerSec=12.016056953958573, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:23:54,439] [INFO] [timer.py:197:stop] 0/3202, RunningAvgSamplesPerSec=11.9991435651811, CurrSamplesPerSec=11.999838711999086, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:24:00,923] [INFO] [timer.py:197:stop] 0/3203, RunningAvgSamplesPerSec=11.999135214906408, CurrSamplesPerSec=11.972473726924143, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:24:07,437] [INFO] [timer.py:197:stop] 0/3204, RunningAvgSamplesPerSec=11.999099828275861, CurrSamplesPerSec=11.886886855592428, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:24:14,084] [INFO] [timer.py:197:stop] 0/3205, RunningAvgSamplesPerSec=11.999079858269317, CurrSamplesPerSec=11.935474958265814, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:24:20,721] [INFO] [timer.py:197:stop] 0/3206, RunningAvgSamplesPerSec=11.999082173075225, CurrSamplesPerSec=12.006501082030203, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:24:27,173] [INFO] [timer.py:197:stop] 0/3207, RunningAvgSamplesPerSec=11.99906162901218, CurrSamplesPerSec=11.933597678288251, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:24:33,663] [INFO] [timer.py:197:stop] 0/3208, RunningAvgSamplesPerSec=11.999067205684955, 
CurrSamplesPerSec=12.016967113090555, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:24:40,174] [INFO] [timer.py:197:stop] 0/3209, RunningAvgSamplesPerSec=11.999032625355387, CurrSamplesPerSec=11.889183352877325, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:24:46,865] [INFO] [logging.py:68:log_dist] [Rank 0] step=3210, skipped=6, lr=[3.993333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 00:24:46,866] [INFO] [timer.py:197:stop] 0/3210, RunningAvgSamplesPerSec=11.998957641377151, CurrSamplesPerSec=11.76321016094933, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:24:53,399] [INFO] [timer.py:197:stop] 0/3211, RunningAvgSamplesPerSec=11.998923382286616, CurrSamplesPerSec=11.890018040409117, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:24:59,900] [INFO] [timer.py:197:stop] 0/3212, RunningAvgSamplesPerSec=11.99889070129977, CurrSamplesPerSec=11.894926371756064, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:25:06,363] [INFO] [timer.py:197:stop] 0/3213, RunningAvgSamplesPerSec=11.998884782560037, CurrSamplesPerSec=11.97991567324749, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:25:12,853] [INFO] [timer.py:197:stop] 0/3214, RunningAvgSamplesPerSec=11.998866547698187, CurrSamplesPerSec=11.940598830011462, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:25:19,424] [INFO] [timer.py:197:stop] 0/3215, RunningAvgSamplesPerSec=11.998807162839796, CurrSamplesPerSec=11.811048709432768, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:25:25,905] [INFO] [timer.py:197:stop] 0/3216, RunningAvgSamplesPerSec=11.998800098399428, CurrSamplesPerSec=11.976144921480282, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:25:32,556] [INFO] [timer.py:197:stop] 0/3217, RunningAvgSamplesPerSec=11.998799578636362, CurrSamplesPerSec=11.997129292753591, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:25:39,054] [INFO] [timer.py:197:stop] 0/3218, 
RunningAvgSamplesPerSec=11.998761370146893, CurrSamplesPerSec=11.877166323141557, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:25:45,470] [INFO] [timer.py:197:stop] 0/3219, RunningAvgSamplesPerSec=11.998763546738655, CurrSamplesPerSec=12.00576755315283, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:25:51,942] [INFO] [logging.py:68:log_dist] [Rank 0] step=3220, skipped=6, lr=[3.971111111111111e-06], mom=[[0.9, 0.999]] [2022-12-20 00:25:51,942] [INFO] [timer.py:197:stop] 0/3220, RunningAvgSamplesPerSec=11.998756756683195, CurrSamplesPerSec=11.97695285438364, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:25:58,450] [INFO] [timer.py:197:stop] 0/3221, RunningAvgSamplesPerSec=11.998721631234712, CurrSamplesPerSec=11.886743156582417, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:26:05,060] [INFO] [timer.py:197:stop] 0/3222, RunningAvgSamplesPerSec=11.998697284786447, CurrSamplesPerSec=11.92083479585558, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:26:11,700] [INFO] [timer.py:197:stop] 0/3223, RunningAvgSamplesPerSec=11.99867328440617, CurrSamplesPerSec=11.921886781106696, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:26:18,432] [INFO] [timer.py:197:stop] 0/3224, RunningAvgSamplesPerSec=11.998644079629361, CurrSamplesPerSec=11.905307473607436, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:26:24,906] [INFO] [timer.py:197:stop] 0/3225, RunningAvgSamplesPerSec=11.998611102949589, CurrSamplesPerSec=11.893293148409668, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.96e-06, 'epoch': 84.87} [2022-12-20 00:26:31,372] [INFO] [timer.py:197:stop] 0/3226, RunningAvgSamplesPerSec=11.998566171165946, CurrSamplesPerSec=11.85547854585351, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:26:37,816] [INFO] [timer.py:197:stop] 0/3227, RunningAvgSamplesPerSec=11.99855860332884, CurrSamplesPerSec=11.974209425170228, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 00:26:44,398] [INFO] [timer.py:197:stop] 0/3228, RunningAvgSamplesPerSec=11.99853329104307, CurrSamplesPerSec=11.917452971124948, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:26:51,180] [INFO] [timer.py:197:stop] 0/3229, RunningAvgSamplesPerSec=11.9985107216317, CurrSamplesPerSec=11.926141087428881, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:26:55,946] [INFO] [logging.py:68:log_dist] [Rank 0] step=3230, skipped=6, lr=[3.948888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 00:26:55,947] [INFO] [timer.py:197:stop] 0/3230, RunningAvgSamplesPerSec=11.999543793699166, CurrSamplesPerSec=16.616294457804983, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:27:02,447] [INFO] [timer.py:197:stop] 0/3231, RunningAvgSamplesPerSec=11.999535379035958, CurrSamplesPerSec=11.972434212269269, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:27:08,925] [INFO] [timer.py:197:stop] 0/3232, RunningAvgSamplesPerSec=11.999522127221157, CurrSamplesPerSec=11.95688411082532, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:27:15,476] [INFO] [timer.py:197:stop] 0/3233, RunningAvgSamplesPerSec=11.999530058028428, CurrSamplesPerSec=12.02520138543918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:27:22,086] [INFO] [timer.py:197:stop] 0/3234, RunningAvgSamplesPerSec=11.999499227468158, CurrSamplesPerSec=11.900706069972673, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:27:28,631] [INFO] [timer.py:197:stop] 0/3235, RunningAvgSamplesPerSec=11.999473119752473, CurrSamplesPerSec=11.915682380343, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:27:35,137] [INFO] [timer.py:197:stop] 0/3236, RunningAvgSamplesPerSec=11.999446977376936, CurrSamplesPerSec=11.915520001136352, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:27:41,661] [INFO] [timer.py:197:stop] 0/3237, RunningAvgSamplesPerSec=11.99939904751852, CurrSamplesPerSec=11.846371270254002, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:27:48,270] [INFO] [timer.py:197:stop] 0/3238, RunningAvgSamplesPerSec=11.999350283252294, CurrSamplesPerSec=11.843645525566254, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:27:55,044] [INFO] [timer.py:197:stop] 0/3239, RunningAvgSamplesPerSec=11.999331820472463, CurrSamplesPerSec=11.939882359654968, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:28:01,472] [INFO] [logging.py:68:log_dist] [Rank 0] step=3240, skipped=6, lr=[3.926666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 00:28:01,473] [INFO] [timer.py:197:stop] 0/3240, RunningAvgSamplesPerSec=11.999329880018658, CurrSamplesPerSec=11.993051918364554, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:28:07,960] [INFO] [timer.py:197:stop] 0/3241, RunningAvgSamplesPerSec=11.999332847192814, CurrSamplesPerSec=12.008948258424738, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:28:14,461] [INFO] [timer.py:197:stop] 0/3242, RunningAvgSamplesPerSec=11.999300081462009, CurrSamplesPerSec=11.894102589301879, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:28:20,967] [INFO] [timer.py:197:stop] 0/3243, RunningAvgSamplesPerSec=11.999271884324209, CurrSamplesPerSec=11.908603690363256, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:28:27,400] [INFO] [timer.py:197:stop] 0/3244, RunningAvgSamplesPerSec=11.999253012020828, CurrSamplesPerSec=11.93839817464477, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:28:33,912] [INFO] [timer.py:197:stop] 0/3245, RunningAvgSamplesPerSec=11.99922824568839, CurrSamplesPerSec=11.919469662540177, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:28:40,341] [INFO] [timer.py:197:stop] 0/3246, RunningAvgSamplesPerSec=11.999206785867536, CurrSamplesPerSec=11.930014021692514, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:28:46,744] [INFO] [timer.py:197:stop] 0/3247, RunningAvgSamplesPerSec=11.99921575493587, 
CurrSamplesPerSec=12.028382157006293, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:28:53,180] [INFO] [timer.py:197:stop] 0/3248, RunningAvgSamplesPerSec=11.999220045806418, CurrSamplesPerSec=12.013160101742143, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:28:59,654] [INFO] [timer.py:197:stop] 0/3249, RunningAvgSamplesPerSec=11.99919838569218, CurrSamplesPerSec=11.929299351112277, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:29:06,078] [INFO] [logging.py:68:log_dist] [Rank 0] step=3250, skipped=6, lr=[3.904444444444444e-06], mom=[[0.9, 0.999]] [2022-12-20 00:29:06,079] [INFO] [timer.py:197:stop] 0/3250, RunningAvgSamplesPerSec=11.99920350395858, CurrSamplesPerSec=12.015845571478973, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.904444444444444e-06, 'epoch': 85.53} [2022-12-20 00:29:12,523] [INFO] [timer.py:197:stop] 0/3251, RunningAvgSamplesPerSec=11.999213366187213, CurrSamplesPerSec=12.031331652605598, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:29:19,040] [INFO] [timer.py:197:stop] 0/3252, RunningAvgSamplesPerSec=11.999194095929415, CurrSamplesPerSec=11.936910112042275, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:29:25,568] [INFO] [timer.py:197:stop] 0/3253, RunningAvgSamplesPerSec=11.999166696876832, CurrSamplesPerSec=11.910775933808615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:29:32,066] [INFO] [timer.py:197:stop] 0/3254, RunningAvgSamplesPerSec=11.999147440600948, CurrSamplesPerSec=11.9368703009784, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:29:38,540] [INFO] [timer.py:197:stop] 0/3255, RunningAvgSamplesPerSec=11.999151883670425, CurrSamplesPerSec=12.01361817064102, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:29:45,038] [INFO] [timer.py:197:stop] 0/3256, RunningAvgSamplesPerSec=11.999135599108552, CurrSamplesPerSec=11.946394830404294, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
00:29:51,515] [INFO] [timer.py:197:stop] 0/3257, RunningAvgSamplesPerSec=11.999100666935703, CurrSamplesPerSec=11.88649840253101, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:29:57,956] [INFO] [timer.py:197:stop] 0/3258, RunningAvgSamplesPerSec=11.999094009359675, CurrSamplesPerSec=11.977462677663238, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:30:04,483] [INFO] [timer.py:197:stop] 0/3259, RunningAvgSamplesPerSec=11.999053080138944, CurrSamplesPerSec=11.867251816435564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:30:10,918] [INFO] [logging.py:68:log_dist] [Rank 0] step=3260, skipped=6, lr=[3.882222222222223e-06], mom=[[0.9, 0.999]] [2022-12-20 00:30:10,918] [INFO] [timer.py:197:stop] 0/3260, RunningAvgSamplesPerSec=11.999048223896018, CurrSamplesPerSec=11.983252268827771, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:30:17,425] [INFO] [timer.py:197:stop] 0/3261, RunningAvgSamplesPerSec=11.999010671755203, CurrSamplesPerSec=11.877901039710048, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:30:23,876] [INFO] [timer.py:197:stop] 0/3262, RunningAvgSamplesPerSec=11.99901642007187, CurrSamplesPerSec=12.01777948738405, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:30:30,439] [INFO] [timer.py:197:stop] 0/3263, RunningAvgSamplesPerSec=11.998981911615493, CurrSamplesPerSec=11.887529595884022, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:30:36,925] [INFO] [timer.py:197:stop] 0/3264, RunningAvgSamplesPerSec=11.998978115454873, CurrSamplesPerSec=11.986611598051264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:30:43,393] [INFO] [timer.py:197:stop] 0/3265, RunningAvgSamplesPerSec=11.998952297119954, CurrSamplesPerSec=11.915320074511516, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:30:49,835] [INFO] [timer.py:197:stop] 0/3266, RunningAvgSamplesPerSec=11.99893290819308, CurrSamplesPerSec=11.93599877063001, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 00:30:56,287] [INFO] [timer.py:197:stop] 0/3267, RunningAvgSamplesPerSec=11.998913267550764, CurrSamplesPerSec=11.935147001892698, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:31:00,847] [INFO] [timer.py:197:stop] 0/3268, RunningAvgSamplesPerSec=11.99994176262093, CurrSamplesPerSec=16.663385290694524, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:31:07,356] [INFO] [timer.py:197:stop] 0/3269, RunningAvgSamplesPerSec=11.999921569831095, CurrSamplesPerSec=11.934332495669263, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:31:13,797] [INFO] [logging.py:68:log_dist] [Rank 0] step=3270, skipped=6, lr=[3.86e-06], mom=[[0.9, 0.999]] [2022-12-20 00:31:13,798] [INFO] [timer.py:197:stop] 0/3270, RunningAvgSamplesPerSec=11.999912547382026, CurrSamplesPerSec=11.970508455998116, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:31:20,182] [INFO] [timer.py:197:stop] 0/3271, RunningAvgSamplesPerSec=11.999916886010134, CurrSamplesPerSec=12.014112300553029, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:31:26,672] [INFO] [timer.py:197:stop] 0/3272, RunningAvgSamplesPerSec=11.999913916308563, CurrSamplesPerSec=11.990213811679078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:31:33,125] [INFO] [timer.py:197:stop] 0/3273, RunningAvgSamplesPerSec=11.999875963396773, CurrSamplesPerSec=11.877040726187765, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:31:39,587] [INFO] [timer.py:197:stop] 0/3274, RunningAvgSamplesPerSec=11.99987993081418, CurrSamplesPerSec=12.01287140724865, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:31:46,096] [INFO] [timer.py:197:stop] 0/3275, RunningAvgSamplesPerSec=11.999881422128917, CurrSamplesPerSec=12.0047629895711, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.848888888888889e-06, 'epoch': 86.18} [2022-12-20 00:31:52,552] [INFO] [timer.py:197:stop] 0/3276, RunningAvgSamplesPerSec=11.99983860009319, 
CurrSamplesPerSec=11.861300676115304, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:31:58,958] [INFO] [timer.py:197:stop] 0/3277, RunningAvgSamplesPerSec=11.999841464568318, CurrSamplesPerSec=12.009227093568423, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:32:05,419] [INFO] [timer.py:197:stop] 0/3278, RunningAvgSamplesPerSec=11.999850819779912, CurrSamplesPerSec=12.030567588517865, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:32:11,814] [INFO] [timer.py:197:stop] 0/3279, RunningAvgSamplesPerSec=11.999854551442153, CurrSamplesPerSec=12.012091947667669, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:32:18,301] [INFO] [logging.py:68:log_dist] [Rank 0] step=3280, skipped=6, lr=[3.837777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 00:32:18,302] [INFO] [timer.py:197:stop] 0/3280, RunningAvgSamplesPerSec=11.999830998773945, CurrSamplesPerSec=11.923142311979248, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:32:24,927] [INFO] [timer.py:197:stop] 0/3281, RunningAvgSamplesPerSec=11.99980737533142, CurrSamplesPerSec=11.922866400830156, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:32:31,486] [INFO] [timer.py:197:stop] 0/3282, RunningAvgSamplesPerSec=11.999780188057047, CurrSamplesPerSec=11.911290709413144, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:32:37,949] [INFO] [timer.py:197:stop] 0/3283, RunningAvgSamplesPerSec=11.999751459229085, CurrSamplesPerSec=11.906255325283036, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:32:44,452] [INFO] [timer.py:197:stop] 0/3284, RunningAvgSamplesPerSec=11.999721685671819, CurrSamplesPerSec=11.90282370938598, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:32:50,844] [INFO] [timer.py:197:stop] 0/3285, RunningAvgSamplesPerSec=11.99972348285028, CurrSamplesPerSec=12.00562472413408, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:32:57,297] [INFO] [timer.py:197:stop] 0/3286, 
RunningAvgSamplesPerSec=11.999726205250976, CurrSamplesPerSec=12.008670510663444, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:33:03,756] [INFO] [timer.py:197:stop] 0/3287, RunningAvgSamplesPerSec=11.999702296555578, CurrSamplesPerSec=11.921696700577726, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:33:10,275] [INFO] [timer.py:197:stop] 0/3288, RunningAvgSamplesPerSec=11.999658824932, CurrSamplesPerSec=11.858534529974142, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:33:16,805] [INFO] [timer.py:197:stop] 0/3289, RunningAvgSamplesPerSec=11.999627697874084, CurrSamplesPerSec=11.898208931103532, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:33:23,271] [INFO] [logging.py:68:log_dist] [Rank 0] step=3290, skipped=6, lr=[3.8155555555555555e-06], mom=[[0.9, 0.999]] [2022-12-20 00:33:23,271] [INFO] [timer.py:197:stop] 0/3290, RunningAvgSamplesPerSec=11.999600131201614, CurrSamplesPerSec=11.909667784566418, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:33:29,771] [INFO] [timer.py:197:stop] 0/3291, RunningAvgSamplesPerSec=11.999576895155347, CurrSamplesPerSec=11.923660275647142, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:33:36,245] [INFO] [timer.py:197:stop] 0/3292, RunningAvgSamplesPerSec=11.999577557114693, CurrSamplesPerSec=12.00175513662265, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:33:42,777] [INFO] [timer.py:197:stop] 0/3293, RunningAvgSamplesPerSec=11.999546663492307, CurrSamplesPerSec=11.898760595712286, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:33:49,287] [INFO] [timer.py:197:stop] 0/3294, RunningAvgSamplesPerSec=11.999520241949876, CurrSamplesPerSec=11.91319269959762, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:33:55,813] [INFO] [timer.py:197:stop] 0/3295, RunningAvgSamplesPerSec=11.999477262542996, CurrSamplesPerSec=11.859638423351813, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:34:02,303] [INFO] 
[timer.py:197:stop] 0/3296, RunningAvgSamplesPerSec=11.999465840763296, CurrSamplesPerSec=11.961971480704934, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:34:08,774] [INFO] [timer.py:197:stop] 0/3297, RunningAvgSamplesPerSec=11.999465628667501, CurrSamplesPerSec=11.998767025807352, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:34:15,219] [INFO] [timer.py:197:stop] 0/3298, RunningAvgSamplesPerSec=11.999471265936716, CurrSamplesPerSec=12.018074874536499, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:34:21,716] [INFO] [timer.py:197:stop] 0/3299, RunningAvgSamplesPerSec=11.999448916695652, CurrSamplesPerSec=11.92623540300975, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:34:28,256] [INFO] [logging.py:68:log_dist] [Rank 0] step=3300, skipped=6, lr=[3.793333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 00:34:28,257] [INFO] [timer.py:197:stop] 0/3300, RunningAvgSamplesPerSec=11.999404263512886, CurrSamplesPerSec=11.853967633492644, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.793333333333334e-06, 'epoch': 86.84} [2022-12-20 00:34:35,240] [INFO] [timer.py:197:stop] 0/3301, RunningAvgSamplesPerSec=11.999371110422468, CurrSamplesPerSec=11.891019820163665, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:34:41,841] [INFO] [timer.py:197:stop] 0/3302, RunningAvgSamplesPerSec=11.999332096744258, CurrSamplesPerSec=11.871992247250851, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:34:48,778] [INFO] [timer.py:197:stop] 0/3303, RunningAvgSamplesPerSec=11.999290609165365, CurrSamplesPerSec=11.863926538228487, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:34:55,871] [INFO] [timer.py:197:stop] 0/3304, RunningAvgSamplesPerSec=11.99922226140341, CurrSamplesPerSec=11.777771401188028, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:35:02,831] [INFO] [timer.py:197:stop] 0/3305, RunningAvgSamplesPerSec=11.999207105168882, 
CurrSamplesPerSec=11.949369144180277, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:35:07,458] [INFO] [timer.py:197:stop] 0/3306, RunningAvgSamplesPerSec=12.000218471254675, CurrSamplesPerSec=16.629943300870405, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:35:14,022] [INFO] [timer.py:197:stop] 0/3307, RunningAvgSamplesPerSec=12.00019086783904, CurrSamplesPerSec=11.909677295699701, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:35:20,546] [INFO] [timer.py:197:stop] 0/3308, RunningAvgSamplesPerSec=12.000153817815299, CurrSamplesPerSec=11.878940729614023, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:35:27,093] [INFO] [timer.py:197:stop] 0/3309, RunningAvgSamplesPerSec=12.000112483562331, CurrSamplesPerSec=11.86500049372038, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:35:33,614] [INFO] [logging.py:68:log_dist] [Rank 0] step=3310, skipped=6, lr=[3.7711111111111116e-06], mom=[[0.9, 0.999]] [2022-12-20 00:35:33,615] [INFO] [timer.py:197:stop] 0/3310, RunningAvgSamplesPerSec=12.000081323323014, CurrSamplesPerSec=11.897912023813957, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:35:40,107] [INFO] [timer.py:197:stop] 0/3311, RunningAvgSamplesPerSec=12.000083048866538, CurrSamplesPerSec=12.005793864132222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:35:46,769] [INFO] [timer.py:197:stop] 0/3312, RunningAvgSamplesPerSec=12.000062012398638, CurrSamplesPerSec=11.930853922079057, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:35:53,354] [INFO] [timer.py:197:stop] 0/3313, RunningAvgSamplesPerSec=11.999968120546734, CurrSamplesPerSec=11.6970340213671, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:35:59,851] [INFO] [timer.py:197:stop] 0/3314, RunningAvgSamplesPerSec=11.999943822399386, CurrSamplesPerSec=11.920028593632624, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:36:06,397] [INFO] [timer.py:197:stop] 0/3315, 
RunningAvgSamplesPerSec=11.99990852568465, CurrSamplesPerSec=11.884134018502179, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:36:12,818] [INFO] [timer.py:197:stop] 0/3316, RunningAvgSamplesPerSec=11.999900777162527, CurrSamplesPerSec=11.974284739161682, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:36:19,384] [INFO] [timer.py:197:stop] 0/3317, RunningAvgSamplesPerSec=11.999864043561395, CurrSamplesPerSec=11.879351820013532, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:36:25,842] [INFO] [timer.py:197:stop] 0/3318, RunningAvgSamplesPerSec=11.999841201307373, CurrSamplesPerSec=11.924594100939728, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:36:32,219] [INFO] [timer.py:197:stop] 0/3319, RunningAvgSamplesPerSec=11.999846753887038, CurrSamplesPerSec=12.018287411615805, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:36:38,669] [INFO] [logging.py:68:log_dist] [Rank 0] step=3320, skipped=6, lr=[3.7488888888888892e-06], mom=[[0.9, 0.999]] [2022-12-20 00:36:38,670] [INFO] [timer.py:197:stop] 0/3320, RunningAvgSamplesPerSec=11.99984206960216, CurrSamplesPerSec=11.984324395491006, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:36:45,193] [INFO] [timer.py:197:stop] 0/3321, RunningAvgSamplesPerSec=11.999799706307341, CurrSamplesPerSec=11.86086620253386, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:36:51,633] [INFO] [timer.py:197:stop] 0/3322, RunningAvgSamplesPerSec=11.999803097204358, CurrSamplesPerSec=12.011068052774423, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:36:58,178] [INFO] [timer.py:197:stop] 0/3323, RunningAvgSamplesPerSec=11.999790568601878, CurrSamplesPerSec=11.958339334358627, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:37:04,681] [INFO] [timer.py:197:stop] 0/3324, RunningAvgSamplesPerSec=11.999779477273519, CurrSamplesPerSec=11.963057929517214, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:37:11,110] [INFO] 
[timer.py:197:stop] 0/3325, RunningAvgSamplesPerSec=11.99974781189558, CurrSamplesPerSec=11.895469824948755, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.737777777777778e-06, 'epoch': 87.5} [2022-12-20 00:37:17,637] [INFO] [timer.py:197:stop] 0/3326, RunningAvgSamplesPerSec=11.999733540716429, CurrSamplesPerSec=11.952497147519706, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:37:24,226] [INFO] [timer.py:197:stop] 0/3327, RunningAvgSamplesPerSec=11.99972108067709, CurrSamplesPerSec=11.958446412743918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:37:30,724] [INFO] [timer.py:197:stop] 0/3328, RunningAvgSamplesPerSec=11.99970043322568, CurrSamplesPerSec=11.931438316477736, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:37:37,174] [INFO] [timer.py:197:stop] 0/3329, RunningAvgSamplesPerSec=11.999703987686498, CurrSamplesPerSec=12.011537786555245, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:37:43,661] [INFO] [logging.py:68:log_dist] [Rank 0] step=3330, skipped=6, lr=[3.726666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 00:37:43,661] [INFO] [timer.py:197:stop] 0/3330, RunningAvgSamplesPerSec=11.99968138609993, CurrSamplesPerSec=11.924954322218307, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:37:50,108] [INFO] [timer.py:197:stop] 0/3331, RunningAvgSamplesPerSec=11.999684913842477, CurrSamplesPerSec=12.011436742320072, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:37:56,680] [INFO] [timer.py:197:stop] 0/3332, RunningAvgSamplesPerSec=11.99962147798412, CurrSamplesPerSec=11.792096771685346, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:38:03,135] [INFO] [timer.py:197:stop] 0/3333, RunningAvgSamplesPerSec=11.999616854502229, CurrSamplesPerSec=11.984240394609804, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:38:09,751] [INFO] [timer.py:197:stop] 0/3334, RunningAvgSamplesPerSec=11.999582173824644, 
CurrSamplesPerSec=11.885162696746738, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:38:16,358] [INFO] [timer.py:197:stop] 0/3335, RunningAvgSamplesPerSec=11.999554295864758, CurrSamplesPerSec=11.90737868531602, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:38:22,867] [INFO] [timer.py:197:stop] 0/3336, RunningAvgSamplesPerSec=11.999559742135654, CurrSamplesPerSec=12.017739673104247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:38:29,306] [INFO] [timer.py:197:stop] 0/3337, RunningAvgSamplesPerSec=11.999530145832168, CurrSamplesPerSec=11.901661104350437, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:38:35,766] [INFO] [timer.py:197:stop] 0/3338, RunningAvgSamplesPerSec=11.99952634717725, CurrSamplesPerSec=11.986871197718758, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:38:42,222] [INFO] [timer.py:197:stop] 0/3339, RunningAvgSamplesPerSec=11.999490950755925, CurrSamplesPerSec=11.882559512295098, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:38:48,695] [INFO] [logging.py:68:log_dist] [Rank 0] step=3340, skipped=6, lr=[3.704444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 00:38:48,696] [INFO] [timer.py:197:stop] 0/3340, RunningAvgSamplesPerSec=11.999484712349805, CurrSamplesPerSec=11.978703215151933, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:38:55,261] [INFO] [timer.py:197:stop] 0/3341, RunningAvgSamplesPerSec=11.999480883353037, CurrSamplesPerSec=11.986713295550404, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:39:01,887] [INFO] [timer.py:197:stop] 0/3342, RunningAvgSamplesPerSec=11.999457492878479, CurrSamplesPerSec=11.92186189556727, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:39:08,507] [INFO] [timer.py:197:stop] 0/3343, RunningAvgSamplesPerSec=11.99943039948859, CurrSamplesPerSec=11.90961600199615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:39:13,150] [INFO] [timer.py:197:stop] 0/3344, 
RunningAvgSamplesPerSec=12.00041401091724, CurrSamplesPerSec=16.52646409194888, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:39:19,762] [INFO] [timer.py:197:stop] 0/3345, RunningAvgSamplesPerSec=12.000384659288851, CurrSamplesPerSec=11.903087081796885, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:39:26,377] [INFO] [timer.py:197:stop] 0/3346, RunningAvgSamplesPerSec=12.000358922331674, CurrSamplesPerSec=11.914932934090805, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:39:32,959] [INFO] [timer.py:197:stop] 0/3347, RunningAvgSamplesPerSec=12.000316957272982, CurrSamplesPerSec=11.861608338298584, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:39:39,473] [INFO] [timer.py:197:stop] 0/3348, RunningAvgSamplesPerSec=12.00029246864709, CurrSamplesPerSec=11.918933540208744, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:39:45,920] [INFO] [timer.py:197:stop] 0/3349, RunningAvgSamplesPerSec=12.000287931855468, CurrSamplesPerSec=11.985127011055445, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:39:52,436] [INFO] [logging.py:68:log_dist] [Rank 0] step=3350, skipped=6, lr=[3.6822222222222225e-06], mom=[[0.9, 0.999]] [2022-12-20 00:39:52,437] [INFO] [timer.py:197:stop] 0/3350, RunningAvgSamplesPerSec=12.000261084599442, CurrSamplesPerSec=11.911071369397932, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.6822222222222225e-06, 'epoch': 88.16} [2022-12-20 00:39:58,916] [INFO] [timer.py:197:stop] 0/3351, RunningAvgSamplesPerSec=12.000226300749834, CurrSamplesPerSec=11.884889592934755, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:40:05,516] [INFO] [timer.py:197:stop] 0/3352, RunningAvgSamplesPerSec=12.000198106531101, CurrSamplesPerSec=11.906513040300885, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:40:12,050] [INFO] [timer.py:197:stop] 0/3353, RunningAvgSamplesPerSec=12.00018824744142, CurrSamplesPerSec=11.967250976858331, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:40:18,692] [INFO] [timer.py:197:stop] 0/3354, RunningAvgSamplesPerSec=12.000138354022384, CurrSamplesPerSec=11.83524360120409, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:40:25,183] [INFO] [timer.py:197:stop] 0/3355, RunningAvgSamplesPerSec=12.000113538536539, CurrSamplesPerSec=11.91750482186749, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:40:31,669] [INFO] [timer.py:197:stop] 0/3356, RunningAvgSamplesPerSec=12.000095135685102, CurrSamplesPerSec=11.93870613260674, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:40:38,306] [INFO] [timer.py:197:stop] 0/3357, RunningAvgSamplesPerSec=12.000071191499433, CurrSamplesPerSec=11.920296433357981, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:40:45,003] [INFO] [timer.py:197:stop] 0/3358, RunningAvgSamplesPerSec=12.00003853964168, CurrSamplesPerSec=11.891482844814762, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:40:51,582] [INFO] [timer.py:197:stop] 0/3359, RunningAvgSamplesPerSec=12.00001401206162, CurrSamplesPerSec=11.918260412448763, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:40:58,222] [INFO] [logging.py:68:log_dist] [Rank 0] step=3360, skipped=6, lr=[3.66e-06], mom=[[0.9, 0.999]] [2022-12-20 00:40:58,223] [INFO] [timer.py:197:stop] 0/3360, RunningAvgSamplesPerSec=12.000011420619359, CurrSamplesPerSec=11.99131825298134, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:41:04,788] [INFO] [timer.py:197:stop] 0/3361, RunningAvgSamplesPerSec=11.99998433487641, CurrSamplesPerSec=11.909714811984683, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:41:11,318] [INFO] [timer.py:197:stop] 0/3362, RunningAvgSamplesPerSec=11.999955001163203, CurrSamplesPerSec=11.902225754779227, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:41:17,864] [INFO] [timer.py:197:stop] 0/3363, RunningAvgSamplesPerSec=11.999892767396325, CurrSamplesPerSec=11.794369740769477, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:41:24,319] [INFO] [timer.py:197:stop] 0/3364, RunningAvgSamplesPerSec=11.999888014058493, CurrSamplesPerSec=11.983933293130256, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:41:30,853] [INFO] [timer.py:197:stop] 0/3365, RunningAvgSamplesPerSec=11.999873404898347, CurrSamplesPerSec=11.950957682190857, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:41:37,579] [INFO] [timer.py:197:stop] 0/3366, RunningAvgSamplesPerSec=11.999836077701103, CurrSamplesPerSec=11.875604695464595, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:41:44,028] [INFO] [timer.py:197:stop] 0/3367, RunningAvgSamplesPerSec=11.999818796468116, CurrSamplesPerSec=11.941965088868693, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:41:50,514] [INFO] [timer.py:197:stop] 0/3368, RunningAvgSamplesPerSec=11.999799598347062, CurrSamplesPerSec=11.935543948146822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:41:56,978] [INFO] [timer.py:197:stop] 0/3369, RunningAvgSamplesPerSec=11.999755035805661, CurrSamplesPerSec=11.851609893003458, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:42:03,468] [INFO] [logging.py:68:log_dist] [Rank 0] step=3370, skipped=6, lr=[3.6377777777777777e-06], mom=[[0.9, 0.999]] [2022-12-20 00:42:03,468] [INFO] [timer.py:197:stop] 0/3370, RunningAvgSamplesPerSec=11.999741144328357, CurrSamplesPerSec=11.953150196430578, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:42:10,074] [INFO] [timer.py:197:stop] 0/3371, RunningAvgSamplesPerSec=11.999708518075332, CurrSamplesPerSec=11.89082071484781, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:42:16,580] [INFO] [timer.py:197:stop] 0/3372, RunningAvgSamplesPerSec=11.999706972769731, CurrSamplesPerSec=11.994503096599342, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:42:23,185] [INFO] [timer.py:197:stop] 0/3373, RunningAvgSamplesPerSec=11.999691667295641, 
CurrSamplesPerSec=11.948333044901993, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:42:29,774] [INFO] [timer.py:197:stop] 0/3374, RunningAvgSamplesPerSec=11.999670081294147, CurrSamplesPerSec=11.927342397923388, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:42:36,296] [INFO] [timer.py:197:stop] 0/3375, RunningAvgSamplesPerSec=11.99963015765106, CurrSamplesPerSec=11.866501630323306, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.6266666666666674e-06, 'epoch': 88.82} [2022-12-20 00:42:42,813] [INFO] [timer.py:197:stop] 0/3376, RunningAvgSamplesPerSec=11.999606741689192, CurrSamplesPerSec=11.921141318942691, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:42:49,320] [INFO] [timer.py:197:stop] 0/3377, RunningAvgSamplesPerSec=11.999576557002104, CurrSamplesPerSec=11.898590766166704, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:42:55,823] [INFO] [timer.py:197:stop] 0/3378, RunningAvgSamplesPerSec=11.999551614702074, CurrSamplesPerSec=11.915957959092646, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:43:02,305] [INFO] [timer.py:197:stop] 0/3379, RunningAvgSamplesPerSec=11.999534261363756, CurrSamplesPerSec=11.94123411211569, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:43:08,797] [INFO] [logging.py:68:log_dist] [Rank 0] step=3380, skipped=6, lr=[3.615555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 00:43:08,798] [INFO] [timer.py:197:stop] 0/3380, RunningAvgSamplesPerSec=11.999509485787945, CurrSamplesPerSec=11.9164218699608, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:43:15,234] [INFO] [timer.py:197:stop] 0/3381, RunningAvgSamplesPerSec=11.999513348006094, CurrSamplesPerSec=12.012574125558496, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:43:19,907] [INFO] [timer.py:197:stop] 0/3382, RunningAvgSamplesPerSec=12.000481625776287, CurrSamplesPerSec=16.499180067099278, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
00:43:26,380] [INFO] [timer.py:197:stop] 0/3383, RunningAvgSamplesPerSec=12.000456661836157, CurrSamplesPerSec=11.916667857880098, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:43:32,839] [INFO] [timer.py:197:stop] 0/3384, RunningAvgSamplesPerSec=12.000446401905885, CurrSamplesPerSec=11.965857590600535, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:43:39,396] [INFO] [timer.py:197:stop] 0/3385, RunningAvgSamplesPerSec=12.000409686911338, CurrSamplesPerSec=11.877511597074925, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:43:45,882] [INFO] [timer.py:197:stop] 0/3386, RunningAvgSamplesPerSec=12.000414398957933, CurrSamplesPerSec=12.016376462204109, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:43:52,392] [INFO] [timer.py:197:stop] 0/3387, RunningAvgSamplesPerSec=12.00039578196044, CurrSamplesPerSec=11.937724970155182, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:43:58,805] [INFO] [timer.py:197:stop] 0/3388, RunningAvgSamplesPerSec=12.000399889762077, CurrSamplesPerSec=12.014320933437771, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:44:05,241] [INFO] [timer.py:197:stop] 0/3389, RunningAvgSamplesPerSec=12.000403884107232, CurrSamplesPerSec=12.013944001470126, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:44:11,759] [INFO] [logging.py:68:log_dist] [Rank 0] step=3390, skipped=6, lr=[3.593333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 00:44:11,760] [INFO] [timer.py:197:stop] 0/3390, RunningAvgSamplesPerSec=12.000383731773272, CurrSamplesPerSec=11.932513921501897, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:44:18,264] [INFO] [timer.py:197:stop] 0/3391, RunningAvgSamplesPerSec=12.000377116298019, CurrSamplesPerSec=11.978005681840239, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:44:24,747] [INFO] [timer.py:197:stop] 0/3392, RunningAvgSamplesPerSec=12.000329592249907, CurrSamplesPerSec=11.841404194453052, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 00:44:31,263] [INFO] [timer.py:197:stop] 0/3393, RunningAvgSamplesPerSec=12.000296837682537, CurrSamplesPerSec=11.890277159681675, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:44:37,759] [INFO] [timer.py:197:stop] 0/3394, RunningAvgSamplesPerSec=12.000262803354614, CurrSamplesPerSec=11.88595208468623, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:44:44,240] [INFO] [timer.py:197:stop] 0/3395, RunningAvgSamplesPerSec=12.000265543303628, CurrSamplesPerSec=12.009566655968051, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:44:50,701] [INFO] [timer.py:197:stop] 0/3396, RunningAvgSamplesPerSec=12.000237626190943, CurrSamplesPerSec=11.906256909562588, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:44:57,142] [INFO] [timer.py:197:stop] 0/3397, RunningAvgSamplesPerSec=12.000236800302547, CurrSamplesPerSec=11.997434389875396, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:45:03,567] [INFO] [timer.py:197:stop] 0/3398, RunningAvgSamplesPerSec=12.000243386181278, CurrSamplesPerSec=12.022644194338199, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:45:09,952] [INFO] [timer.py:197:stop] 0/3399, RunningAvgSamplesPerSec=12.000229357863072, CurrSamplesPerSec=11.952777625147174, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:45:16,511] [INFO] [logging.py:68:log_dist] [Rank 0] step=3400, skipped=6, lr=[3.5711111111111114e-06], mom=[[0.9, 0.999]] [2022-12-20 00:45:16,511] [INFO] [timer.py:197:stop] 0/3400, RunningAvgSamplesPerSec=12.000202004944875, CurrSamplesPerSec=11.907998287675031, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.5711111111111114e-06, 'epoch': 89.47} [2022-12-20 00:45:23,056] [INFO] [timer.py:197:stop] 0/3401, RunningAvgSamplesPerSec=12.000154996278871, CurrSamplesPerSec=11.842518484506085, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:45:29,496] [INFO] [timer.py:197:stop] 0/3402, 
RunningAvgSamplesPerSec=12.000125717295061, CurrSamplesPerSec=11.90142523354127, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:45:36,038] [INFO] [timer.py:197:stop] 0/3403, RunningAvgSamplesPerSec=12.00008487428459, CurrSamplesPerSec=11.862807691811378, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:45:42,490] [INFO] [timer.py:197:stop] 0/3404, RunningAvgSamplesPerSec=12.000073270481916, CurrSamplesPerSec=11.960738136717685, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:45:48,957] [INFO] [timer.py:197:stop] 0/3405, RunningAvgSamplesPerSec=12.000066967374407, CurrSamplesPerSec=11.978662055758662, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:45:55,495] [INFO] [timer.py:197:stop] 0/3406, RunningAvgSamplesPerSec=12.00004644048032, CurrSamplesPerSec=11.93059780355772, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:46:01,962] [INFO] [timer.py:197:stop] 0/3407, RunningAvgSamplesPerSec=12.00005101780168, CurrSamplesPerSec=12.015652483041883, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:46:08,414] [INFO] [timer.py:197:stop] 0/3408, RunningAvgSamplesPerSec=12.00005211404074, CurrSamplesPerSec=12.003785969821173, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:46:14,921] [INFO] [timer.py:197:stop] 0/3409, RunningAvgSamplesPerSec=12.000019317709276, CurrSamplesPerSec=11.889345541659122, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:46:21,388] [INFO] [logging.py:68:log_dist] [Rank 0] step=3410, skipped=6, lr=[3.548888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 00:46:21,388] [INFO] [timer.py:197:stop] 0/3410, RunningAvgSamplesPerSec=11.999997612456674, CurrSamplesPerSec=11.926500872287868, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:46:27,846] [INFO] [timer.py:197:stop] 0/3411, RunningAvgSamplesPerSec=11.9999705513738, CurrSamplesPerSec=11.908449956624631, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:46:34,345] [INFO] [timer.py:197:stop] 
0/3412, RunningAvgSamplesPerSec=11.999949083026904, CurrSamplesPerSec=11.927207258284348, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:46:40,868] [INFO] [timer.py:197:stop] 0/3413, RunningAvgSamplesPerSec=11.999924334503687, CurrSamplesPerSec=11.916121408173442, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:46:47,438] [INFO] [timer.py:197:stop] 0/3414, RunningAvgSamplesPerSec=11.999875468707083, CurrSamplesPerSec=11.835478421512683, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:46:53,901] [INFO] [timer.py:197:stop] 0/3415, RunningAvgSamplesPerSec=11.999849418080725, CurrSamplesPerSec=11.91161841316044, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:47:00,350] [INFO] [timer.py:197:stop] 0/3416, RunningAvgSamplesPerSec=11.999845400375515, CurrSamplesPerSec=11.986148628615641, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:47:06,864] [INFO] [timer.py:197:stop] 0/3417, RunningAvgSamplesPerSec=11.999823643670723, CurrSamplesPerSec=11.926003325778085, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:47:13,344] [INFO] [timer.py:197:stop] 0/3418, RunningAvgSamplesPerSec=11.999798187685336, CurrSamplesPerSec=11.913491427502223, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:47:19,784] [INFO] [timer.py:197:stop] 0/3419, RunningAvgSamplesPerSec=11.999801697838905, CurrSamplesPerSec=12.011804379505342, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:47:24,422] [INFO] [logging.py:68:log_dist] [Rank 0] step=3420, skipped=6, lr=[3.526666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 00:47:24,423] [INFO] [timer.py:197:stop] 0/3420, RunningAvgSamplesPerSec=12.000773497913954, CurrSamplesPerSec=16.59226102929019, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:47:30,894] [INFO] [timer.py:197:stop] 0/3421, RunningAvgSamplesPerSec=12.000755517294907, CurrSamplesPerSec=11.93961098377406, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:47:37,395] [INFO] 
[timer.py:197:stop] 0/3422, RunningAvgSamplesPerSec=12.00072495375821, CurrSamplesPerSec=11.897130537628748, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:47:43,953] [INFO] [timer.py:197:stop] 0/3423, RunningAvgSamplesPerSec=12.000697443468544, CurrSamplesPerSec=11.907344352884941, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:47:50,407] [INFO] [timer.py:197:stop] 0/3424, RunningAvgSamplesPerSec=12.000700080880941, CurrSamplesPerSec=12.0097294593083, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:47:56,873] [INFO] [timer.py:197:stop] 0/3425, RunningAvgSamplesPerSec=12.000702172412133, CurrSamplesPerSec=12.007863664512753, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.515555555555556e-06, 'epoch': 90.13} [2022-12-20 00:48:03,293] [INFO] [timer.py:197:stop] 0/3426, RunningAvgSamplesPerSec=12.000673208671875, CurrSamplesPerSec=11.902342913261693, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:48:09,808] [INFO] [timer.py:197:stop] 0/3427, RunningAvgSamplesPerSec=12.000648535184146, CurrSamplesPerSec=11.916757262457349, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:48:16,296] [INFO] [timer.py:197:stop] 0/3428, RunningAvgSamplesPerSec=12.000618284164158, CurrSamplesPerSec=11.897895675885623, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:48:22,818] [INFO] [timer.py:197:stop] 0/3429, RunningAvgSamplesPerSec=12.00060226548592, CurrSamplesPerSec=11.945972175791578, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:48:29,211] [INFO] [logging.py:68:log_dist] [Rank 0] step=3430, skipped=6, lr=[3.5044444444444447e-06], mom=[[0.9, 0.999]] [2022-12-20 00:48:29,211] [INFO] [timer.py:197:stop] 0/3430, RunningAvgSamplesPerSec=12.00057876437523, CurrSamplesPerSec=11.920577519214952, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:48:35,714] [INFO] [timer.py:197:stop] 0/3431, RunningAvgSamplesPerSec=12.000541941612791, 
CurrSamplesPerSec=11.875627812125838, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:48:42,203] [INFO] [timer.py:197:stop] 0/3432, RunningAvgSamplesPerSec=12.000514163545212, CurrSamplesPerSec=11.906013463550057, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:48:48,728] [INFO] [timer.py:197:stop] 0/3433, RunningAvgSamplesPerSec=12.000495789907555, CurrSamplesPerSec=11.937803542159784, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:48:55,201] [INFO] [timer.py:197:stop] 0/3434, RunningAvgSamplesPerSec=12.000462889603128, CurrSamplesPerSec=11.888634155547013, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:49:01,602] [INFO] [timer.py:197:stop] 0/3435, RunningAvgSamplesPerSec=12.000470030549904, CurrSamplesPerSec=12.025027927395088, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:49:08,134] [INFO] [timer.py:197:stop] 0/3436, RunningAvgSamplesPerSec=12.000432903635351, CurrSamplesPerSec=11.874316083950236, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:49:15,150] [INFO] [timer.py:197:stop] 0/3437, RunningAvgSamplesPerSec=12.000396990218215, CurrSamplesPerSec=11.878325198596047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:49:22,155] [INFO] [timer.py:197:stop] 0/3438, RunningAvgSamplesPerSec=12.00036006088262, CurrSamplesPerSec=11.874835067100102, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:49:28,862] [INFO] [timer.py:197:stop] 0/3439, RunningAvgSamplesPerSec=12.000347494597488, CurrSamplesPerSec=11.957324582179222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:49:35,725] [INFO] [logging.py:68:log_dist] [Rank 0] step=3440, skipped=6, lr=[3.4822222222222223e-06], mom=[[0.9, 0.999]] [2022-12-20 00:49:35,726] [INFO] [timer.py:197:stop] 0/3440, RunningAvgSamplesPerSec=12.000324730991188, CurrSamplesPerSec=11.922593150166856, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:49:42,582] [INFO] [timer.py:197:stop] 0/3441, 
RunningAvgSamplesPerSec=12.000330633422399, CurrSamplesPerSec=12.020657574774713, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:49:49,131] [INFO] [timer.py:197:stop] 0/3442, RunningAvgSamplesPerSec=12.000332995145245, CurrSamplesPerSec=12.00846046237829, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:49:55,570] [INFO] [timer.py:197:stop] 0/3443, RunningAvgSamplesPerSec=12.000309669294639, CurrSamplesPerSec=11.920601870026115, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:50:02,059] [INFO] [timer.py:197:stop] 0/3444, RunningAvgSamplesPerSec=12.000267728200832, CurrSamplesPerSec=11.857663924686392, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:50:08,681] [INFO] [timer.py:197:stop] 0/3445, RunningAvgSamplesPerSec=12.000229657213588, CurrSamplesPerSec=11.870605205926028, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:50:15,266] [INFO] [timer.py:197:stop] 0/3446, RunningAvgSamplesPerSec=12.00023496019121, CurrSamplesPerSec=12.01852094203625, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:50:21,917] [INFO] [timer.py:197:stop] 0/3447, RunningAvgSamplesPerSec=12.000200384109492, CurrSamplesPerSec=11.882290735857083, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:50:28,554] [INFO] [timer.py:197:stop] 0/3448, RunningAvgSamplesPerSec=12.000198644709396, CurrSamplesPerSec=11.994209402941973, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:50:35,124] [INFO] [timer.py:197:stop] 0/3449, RunningAvgSamplesPerSec=12.000184483413607, CurrSamplesPerSec=11.951582360369727, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:50:41,549] [INFO] [logging.py:68:log_dist] [Rank 0] step=3450, skipped=6, lr=[3.46e-06], mom=[[0.9, 0.999]] [2022-12-20 00:50:41,550] [INFO] [timer.py:197:stop] 0/3450, RunningAvgSamplesPerSec=12.000181538251528, CurrSamplesPerSec=11.99003814820901, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.46e-06, 'epoch': 90.79} 
[2022-12-20 00:50:48,155] [INFO] [timer.py:197:stop] 0/3451, RunningAvgSamplesPerSec=12.000104498912584, CurrSamplesPerSec=11.740227114523123, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:50:54,676] [INFO] [timer.py:197:stop] 0/3452, RunningAvgSamplesPerSec=12.000073627477931, CurrSamplesPerSec=11.894534757806706, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:51:01,291] [INFO] [timer.py:197:stop] 0/3453, RunningAvgSamplesPerSec=12.00003094929506, CurrSamplesPerSec=11.854576454480112, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:51:07,990] [INFO] [timer.py:197:stop] 0/3454, RunningAvgSamplesPerSec=12.000020348096747, CurrSamplesPerSec=11.963546842553333, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:51:14,410] [INFO] [timer.py:197:stop] 0/3455, RunningAvgSamplesPerSec=12.000020725052384, CurrSamplesPerSec=12.001322117066554, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:51:20,939] [INFO] [timer.py:197:stop] 0/3456, RunningAvgSamplesPerSec=11.999992066185255, CurrSamplesPerSec=11.901842630981344, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:51:27,389] [INFO] [timer.py:197:stop] 0/3457, RunningAvgSamplesPerSec=11.99996758994549, CurrSamplesPerSec=11.916018260173288, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:51:32,000] [INFO] [timer.py:197:stop] 0/3458, RunningAvgSamplesPerSec=12.000934513723694, CurrSamplesPerSec=16.630873666600355, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:51:38,570] [INFO] [timer.py:197:stop] 0/3459, RunningAvgSamplesPerSec=12.000903154528425, CurrSamplesPerSec=11.893496025265822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:51:45,062] [INFO] [logging.py:68:log_dist] [Rank 0] step=3460, skipped=6, lr=[3.4377777777777784e-06], mom=[[0.9, 0.999]] [2022-12-20 00:51:45,062] [INFO] [timer.py:197:stop] 0/3460, RunningAvgSamplesPerSec=12.000880743017472, CurrSamplesPerSec=11.92390126590319, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 00:51:51,614] [INFO] [timer.py:197:stop] 0/3461, RunningAvgSamplesPerSec=12.000859499637862, CurrSamplesPerSec=11.927846946301296, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:51:58,207] [INFO] [timer.py:197:stop] 0/3462, RunningAvgSamplesPerSec=12.000849781715136, CurrSamplesPerSec=11.967329404490009, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:52:04,650] [INFO] [timer.py:197:stop] 0/3463, RunningAvgSamplesPerSec=12.000847542375835, CurrSamplesPerSec=11.993104429030506, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:52:11,151] [INFO] [timer.py:197:stop] 0/3464, RunningAvgSamplesPerSec=12.00082573620097, CurrSamplesPerSec=11.925826359712273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:52:17,573] [INFO] [timer.py:197:stop] 0/3465, RunningAvgSamplesPerSec=12.000829614848548, CurrSamplesPerSec=12.014272538566471, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:52:24,010] [INFO] [timer.py:197:stop] 0/3466, RunningAvgSamplesPerSec=12.000833942434783, CurrSamplesPerSec=12.015839117169127, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:52:30,527] [INFO] [timer.py:197:stop] 0/3467, RunningAvgSamplesPerSec=12.000799634573458, CurrSamplesPerSec=11.883122877705128, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:52:37,075] [INFO] [timer.py:197:stop] 0/3468, RunningAvgSamplesPerSec=12.000774061055328, CurrSamplesPerSec=11.912811512823975, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:52:43,626] [INFO] [timer.py:197:stop] 0/3469, RunningAvgSamplesPerSec=12.00075682239046, CurrSamplesPerSec=11.941303699798265, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:52:50,162] [INFO] [logging.py:68:log_dist] [Rank 0] step=3470, skipped=6, lr=[3.415555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 00:52:50,163] [INFO] [timer.py:197:stop] 0/3470, RunningAvgSamplesPerSec=12.000758867036927, CurrSamplesPerSec=12.007851847332189, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:52:56,706] [INFO] [timer.py:197:stop] 0/3471, RunningAvgSamplesPerSec=12.000765862977188, CurrSamplesPerSec=12.025076947639409, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:53:03,365] [INFO] [timer.py:197:stop] 0/3472, RunningAvgSamplesPerSec=12.000744548830372, CurrSamplesPerSec=11.927258663982233, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:53:09,848] [INFO] [timer.py:197:stop] 0/3473, RunningAvgSamplesPerSec=12.000731274858131, CurrSamplesPerSec=11.954846753897316, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:53:16,281] [INFO] [timer.py:197:stop] 0/3474, RunningAvgSamplesPerSec=12.000709776939544, CurrSamplesPerSec=11.926551742001802, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:53:22,760] [INFO] [timer.py:197:stop] 0/3475, RunningAvgSamplesPerSec=12.000698571094706, CurrSamplesPerSec=11.961917643182487, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.404444444444445e-06, 'epoch': 91.45} [2022-12-20 00:53:29,261] [INFO] [timer.py:197:stop] 0/3476, RunningAvgSamplesPerSec=12.000703435954588, CurrSamplesPerSec=12.01762292193991, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:53:35,868] [INFO] [timer.py:197:stop] 0/3477, RunningAvgSamplesPerSec=12.000714322163596, CurrSamplesPerSec=12.038652604232373, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:53:42,493] [INFO] [timer.py:197:stop] 0/3478, RunningAvgSamplesPerSec=12.000714909575528, CurrSamplesPerSec=12.002756513407597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:53:49,099] [INFO] [timer.py:197:stop] 0/3479, RunningAvgSamplesPerSec=12.000690252539853, CurrSamplesPerSec=11.915590347285146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:53:55,778] [INFO] [logging.py:68:log_dist] [Rank 0] step=3480, skipped=6, lr=[3.3933333333333336e-06], mom=[[0.9, 0.999]] [2022-12-20 00:53:55,779] [INFO] [timer.py:197:stop] 
0/3480, RunningAvgSamplesPerSec=12.000645651289828, CurrSamplesPerSec=11.847546105428439, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:54:02,275] [INFO] [timer.py:197:stop] 0/3481, RunningAvgSamplesPerSec=12.000624042660027, CurrSamplesPerSec=11.925937095348068, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:54:08,723] [INFO] [timer.py:197:stop] 0/3482, RunningAvgSamplesPerSec=12.00061909256122, CurrSamplesPerSec=11.983422383906621, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:54:15,215] [INFO] [timer.py:197:stop] 0/3483, RunningAvgSamplesPerSec=12.000599711547345, CurrSamplesPerSec=11.933530832966461, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:54:21,773] [INFO] [timer.py:197:stop] 0/3484, RunningAvgSamplesPerSec=12.000577502102516, CurrSamplesPerSec=11.92376143740348, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:54:28,354] [INFO] [timer.py:197:stop] 0/3485, RunningAvgSamplesPerSec=12.000581233042496, CurrSamplesPerSec=12.013586448781643, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:54:34,834] [INFO] [timer.py:197:stop] 0/3486, RunningAvgSamplesPerSec=12.00056953719541, CurrSamplesPerSec=11.959970756298283, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:54:41,237] [INFO] [timer.py:197:stop] 0/3487, RunningAvgSamplesPerSec=12.00054844108999, CurrSamplesPerSec=11.927497149820235, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:54:47,863] [INFO] [timer.py:197:stop] 0/3488, RunningAvgSamplesPerSec=12.000497796478031, CurrSamplesPerSec=11.826560232284553, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:54:54,399] [INFO] [timer.py:197:stop] 0/3489, RunningAvgSamplesPerSec=12.000476370205698, CurrSamplesPerSec=11.92624653022722, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:55:00,918] [INFO] [logging.py:68:log_dist] [Rank 0] step=3490, skipped=6, lr=[3.371111111111111e-06], mom=[[0.9, 0.999]] [2022-12-20 00:55:00,919] [INFO] 
[timer.py:197:stop] 0/3490, RunningAvgSamplesPerSec=12.000433599162873, CurrSamplesPerSec=11.8531222979091, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:55:07,388] [INFO] [timer.py:197:stop] 0/3491, RunningAvgSamplesPerSec=12.000390387503781, CurrSamplesPerSec=11.851538207266003, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:55:13,855] [INFO] [timer.py:197:stop] 0/3492, RunningAvgSamplesPerSec=12.0003838499656, CurrSamplesPerSec=11.977617663992152, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:55:20,395] [INFO] [timer.py:197:stop] 0/3493, RunningAvgSamplesPerSec=12.000343856396842, CurrSamplesPerSec=11.862371535199987, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:55:26,879] [INFO] [timer.py:197:stop] 0/3494, RunningAvgSamplesPerSec=12.000314253441095, CurrSamplesPerSec=11.897852960542995, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:55:33,508] [INFO] [timer.py:197:stop] 0/3495, RunningAvgSamplesPerSec=12.000276239076776, CurrSamplesPerSec=11.8689828524252, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:55:38,298] [INFO] [timer.py:197:stop] 0/3496, RunningAvgSamplesPerSec=12.001214542508162, CurrSamplesPerSec=16.510544902104602, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:55:44,755] [INFO] [timer.py:197:stop] 0/3497, RunningAvgSamplesPerSec=12.001211778159876, CurrSamplesPerSec=11.991560914533263, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:55:51,137] [INFO] [timer.py:197:stop] 0/3498, RunningAvgSamplesPerSec=12.00121875494935, CurrSamplesPerSec=12.025652292026702, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:55:57,759] [INFO] [timer.py:197:stop] 0/3499, RunningAvgSamplesPerSec=12.001223375032291, CurrSamplesPerSec=12.017396958388499, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:56:04,408] [INFO] [logging.py:68:log_dist] [Rank 0] step=3500, skipped=6, lr=[3.3488888888888892e-06], mom=[[0.9, 0.999]] [2022-12-20 
00:56:04,409] [INFO] [timer.py:197:stop] 0/3500, RunningAvgSamplesPerSec=12.001209441901583, CurrSamplesPerSec=11.952682357076444, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.3488888888888892e-06, 'epoch': 92.11} [2022-12-20 00:56:11,028] [INFO] [timer.py:197:stop] 0/3501, RunningAvgSamplesPerSec=12.001212560608115, CurrSamplesPerSec=12.012131724534836, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:56:17,599] [INFO] [timer.py:197:stop] 0/3502, RunningAvgSamplesPerSec=12.001191861484482, CurrSamplesPerSec=11.929200215888226, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:56:24,069] [INFO] [timer.py:197:stop] 0/3503, RunningAvgSamplesPerSec=12.001174099133436, CurrSamplesPerSec=11.93932634444232, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:56:30,575] [INFO] [timer.py:197:stop] 0/3504, RunningAvgSamplesPerSec=12.001152099509843, CurrSamplesPerSec=11.924622705951657, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:56:37,130] [INFO] [timer.py:197:stop] 0/3505, RunningAvgSamplesPerSec=12.001107581905481, CurrSamplesPerSec=11.847206754381498, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:56:43,601] [INFO] [timer.py:197:stop] 0/3506, RunningAvgSamplesPerSec=12.001089769946354, CurrSamplesPerSec=11.93901729247645, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:56:50,238] [INFO] [timer.py:197:stop] 0/3507, RunningAvgSamplesPerSec=12.001072602558184, CurrSamplesPerSec=11.941218176119312, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:56:56,836] [INFO] [timer.py:197:stop] 0/3508, RunningAvgSamplesPerSec=12.001076400107147, CurrSamplesPerSec=12.014401592423011, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:57:03,379] [INFO] [timer.py:197:stop] 0/3509, RunningAvgSamplesPerSec=12.001085213895848, CurrSamplesPerSec=12.032066151486744, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:57:09,823] [INFO] [logging.py:68:log_dist] 
[Rank 0] step=3510, skipped=6, lr=[3.326666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 00:57:09,823] [INFO] [timer.py:197:stop] 0/3510, RunningAvgSamplesPerSec=12.001082397607682, CurrSamplesPerSec=11.991213799045582, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:57:16,214] [INFO] [timer.py:197:stop] 0/3511, RunningAvgSamplesPerSec=12.00109044611686, CurrSamplesPerSec=12.029391216639093, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:57:22,725] [INFO] [timer.py:197:stop] 0/3512, RunningAvgSamplesPerSec=12.001092632290073, CurrSamplesPerSec=12.008768822231827, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:57:29,164] [INFO] [timer.py:197:stop] 0/3513, RunningAvgSamplesPerSec=12.001069006215591, CurrSamplesPerSec=11.918710743436174, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:57:35,626] [INFO] [timer.py:197:stop] 0/3514, RunningAvgSamplesPerSec=12.00107644192846, CurrSamplesPerSec=12.027240161805052, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:57:42,001] [INFO] [timer.py:197:stop] 0/3515, RunningAvgSamplesPerSec=12.001081606148365, CurrSamplesPerSec=12.019245805080567, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:57:48,485] [INFO] [timer.py:197:stop] 0/3516, RunningAvgSamplesPerSec=12.001036890892436, CurrSamplesPerSec=11.84598232410422, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:57:54,956] [INFO] [timer.py:197:stop] 0/3517, RunningAvgSamplesPerSec=12.001041268129663, CurrSamplesPerSec=12.016442625125945, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:58:01,465] [INFO] [timer.py:197:stop] 0/3518, RunningAvgSamplesPerSec=12.001012184138066, CurrSamplesPerSec=11.899645686529283, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:58:07,982] [INFO] [timer.py:197:stop] 0/3519, RunningAvgSamplesPerSec=12.000992632494121, CurrSamplesPerSec=11.93264069443419, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:58:14,516] [INFO] 
[logging.py:68:log_dist] [Rank 0] step=3520, skipped=6, lr=[3.3044444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 00:58:14,517] [INFO] [timer.py:197:stop] 0/3520, RunningAvgSamplesPerSec=12.000948664157004, CurrSamplesPerSec=11.84827977667795, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:58:20,962] [INFO] [timer.py:197:stop] 0/3521, RunningAvgSamplesPerSec=12.000949060877971, CurrSamplesPerSec=12.00234488761159, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:58:27,453] [INFO] [timer.py:197:stop] 0/3522, RunningAvgSamplesPerSec=12.000907543658272, CurrSamplesPerSec=11.856566159469363, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:58:33,924] [INFO] [timer.py:197:stop] 0/3523, RunningAvgSamplesPerSec=12.000904470755083, CurrSamplesPerSec=11.990097594721766, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:58:40,418] [INFO] [timer.py:197:stop] 0/3524, RunningAvgSamplesPerSec=12.000900524436336, CurrSamplesPerSec=11.987021610093597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:58:46,878] [INFO] [timer.py:197:stop] 0/3525, RunningAvgSamplesPerSec=12.000864913617026, CurrSamplesPerSec=11.876741196057283, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0001, 'learning_rate': 3.2933333333333333e-06, 'epoch': 92.76} [2022-12-20 00:58:53,661] [INFO] [timer.py:197:stop] 0/3526, RunningAvgSamplesPerSec=12.000844871392617, CurrSamplesPerSec=11.930649238486, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:59:00,164] [INFO] [timer.py:197:stop] 0/3527, RunningAvgSamplesPerSec=12.000825801768551, CurrSamplesPerSec=11.933998765945667, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:59:06,637] [INFO] [timer.py:197:stop] 0/3528, RunningAvgSamplesPerSec=12.000806988911158, CurrSamplesPerSec=11.934856207757512, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:59:13,457] [INFO] [timer.py:197:stop] 0/3529, RunningAvgSamplesPerSec=12.000775038365333, 
CurrSamplesPerSec=11.889165449192092, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:59:19,882] [INFO] [logging.py:68:log_dist] [Rank 0] step=3530, skipped=6, lr=[3.282222222222223e-06], mom=[[0.9, 0.999]] [2022-12-20 00:59:19,883] [INFO] [timer.py:197:stop] 0/3530, RunningAvgSamplesPerSec=12.000771568479346, CurrSamplesPerSec=11.988545751921508, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:59:26,458] [INFO] [timer.py:197:stop] 0/3531, RunningAvgSamplesPerSec=12.00071806178057, CurrSamplesPerSec=11.814870642273913, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:59:33,111] [INFO] [timer.py:197:stop] 0/3532, RunningAvgSamplesPerSec=12.00068263437976, CurrSamplesPerSec=11.876948763575609, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:59:39,595] [INFO] [timer.py:197:stop] 0/3533, RunningAvgSamplesPerSec=12.00067422140779, CurrSamplesPerSec=11.971049762084514, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:59:44,260] [INFO] [timer.py:197:stop] 0/3534, RunningAvgSamplesPerSec=12.001606030721566, CurrSamplesPerSec=16.534997915842926, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:59:50,747] [INFO] [timer.py:197:stop] 0/3535, RunningAvgSamplesPerSec=12.001596693546807, CurrSamplesPerSec=11.968708191409474, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 00:59:57,230] [INFO] [timer.py:197:stop] 0/3536, RunningAvgSamplesPerSec=12.001562749624444, CurrSamplesPerSec=11.882825670808462, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:00:03,667] [INFO] [timer.py:197:stop] 0/3537, RunningAvgSamplesPerSec=12.001525057665827, CurrSamplesPerSec=11.869784263770846, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:00:10,191] [INFO] [timer.py:197:stop] 0/3538, RunningAvgSamplesPerSec=12.001482799216161, CurrSamplesPerSec=11.853936225679734, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:00:16,664] [INFO] [timer.py:197:stop] 0/3539, 
RunningAvgSamplesPerSec=12.001484064502803, CurrSamplesPerSec=12.005959787046699, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:00:23,066] [INFO] [logging.py:68:log_dist] [Rank 0] step=3540, skipped=6, lr=[3.2600000000000006e-06], mom=[[0.9, 0.999]] [2022-12-20 01:00:23,067] [INFO] [timer.py:197:stop] 0/3540, RunningAvgSamplesPerSec=12.001482359108941, CurrSamplesPerSec=11.995453412041236, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:00:29,512] [INFO] [timer.py:197:stop] 0/3541, RunningAvgSamplesPerSec=12.001461762870285, CurrSamplesPerSec=11.92903216689267, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:00:35,997] [INFO] [timer.py:197:stop] 0/3542, RunningAvgSamplesPerSec=12.00146563646145, CurrSamplesPerSec=12.015189956592131, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:00:42,505] [INFO] [timer.py:197:stop] 0/3543, RunningAvgSamplesPerSec=12.001442151485492, CurrSamplesPerSec=11.918877443324654, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:00:48,972] [INFO] [timer.py:197:stop] 0/3544, RunningAvgSamplesPerSec=12.00141293545977, CurrSamplesPerSec=11.898843402559438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:00:55,402] [INFO] [timer.py:197:stop] 0/3545, RunningAvgSamplesPerSec=12.001405831613948, CurrSamplesPerSec=11.976296667758994, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:01:01,898] [INFO] [timer.py:197:stop] 0/3546, RunningAvgSamplesPerSec=12.001371930754257, CurrSamplesPerSec=11.882451684670585, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:01:08,393] [INFO] [timer.py:197:stop] 0/3547, RunningAvgSamplesPerSec=12.00134583745549, CurrSamplesPerSec=11.909578486333679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:01:15,056] [INFO] [timer.py:197:stop] 0/3548, RunningAvgSamplesPerSec=12.00134791225233, CurrSamplesPerSec=12.008707578770675, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:01:21,561] [INFO] 
[timer.py:197:stop] 0/3549, RunningAvgSamplesPerSec=12.001322791283675, CurrSamplesPerSec=11.912900330863286, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:01:28,017] [INFO] [logging.py:68:log_dist] [Rank 0] step=3550, skipped=6, lr=[3.237777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 01:01:28,018] [INFO] [timer.py:197:stop] 0/3550, RunningAvgSamplesPerSec=12.00132253022383, CurrSamplesPerSec=12.000396622415014, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 3.237777777777778e-06, 'epoch': 93.42} [2022-12-20 01:01:34,522] [INFO] [timer.py:197:stop] 0/3551, RunningAvgSamplesPerSec=12.001284963310805, CurrSamplesPerSec=11.86946200677602, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:01:41,040] [INFO] [timer.py:197:stop] 0/3552, RunningAvgSamplesPerSec=12.001251987117408, CurrSamplesPerSec=11.885350035328123, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:01:47,514] [INFO] [timer.py:197:stop] 0/3553, RunningAvgSamplesPerSec=12.001245496453029, CurrSamplesPerSec=11.9782478048878, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:01:54,027] [INFO] [timer.py:197:stop] 0/3554, RunningAvgSamplesPerSec=12.0012323276221, CurrSamplesPerSec=11.954651361746196, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:02:00,554] [INFO] [timer.py:197:stop] 0/3555, RunningAvgSamplesPerSec=12.001198920798982, CurrSamplesPerSec=11.88369997496961, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:02:07,117] [INFO] [timer.py:197:stop] 0/3556, RunningAvgSamplesPerSec=12.0011196940809, CurrSamplesPerSec=11.726080175962325, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:02:13,628] [INFO] [timer.py:197:stop] 0/3557, RunningAvgSamplesPerSec=12.001105143967264, CurrSamplesPerSec=11.949615962293347, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:02:20,118] [INFO] [timer.py:197:stop] 0/3558, RunningAvgSamplesPerSec=12.001077044511643, CurrSamplesPerSec=11.90200833163554, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:02:26,576] [INFO] [timer.py:197:stop] 0/3559, RunningAvgSamplesPerSec=12.001074556821196, CurrSamplesPerSec=11.992234847351062, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:02:33,051] [INFO] [logging.py:68:log_dist] [Rank 0] step=3560, skipped=6, lr=[3.2155555555555558e-06], mom=[[0.9, 0.999]] [2022-12-20 01:02:33,051] [INFO] [timer.py:197:stop] 0/3560, RunningAvgSamplesPerSec=12.001074521443112, CurrSamplesPerSec=12.000948682915611, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:02:39,569] [INFO] [timer.py:197:stop] 0/3561, RunningAvgSamplesPerSec=12.0010469921623, CurrSamplesPerSec=11.903890993918111, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:02:46,072] [INFO] [timer.py:197:stop] 0/3562, RunningAvgSamplesPerSec=12.001015312356898, CurrSamplesPerSec=11.889316579052117, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:02:52,624] [INFO] [timer.py:197:stop] 0/3563, RunningAvgSamplesPerSec=12.000989830023352, CurrSamplesPerSec=11.910953510509474, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:02:59,065] [INFO] [timer.py:197:stop] 0/3564, RunningAvgSamplesPerSec=12.000967125029973, CurrSamplesPerSec=11.92065586565792, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:03:05,563] [INFO] [timer.py:197:stop] 0/3565, RunningAvgSamplesPerSec=12.000934724505788, CurrSamplesPerSec=11.886623673237898, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:03:12,061] [INFO] [timer.py:197:stop] 0/3566, RunningAvgSamplesPerSec=12.000907129493884, CurrSamplesPerSec=11.903385303105566, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:03:18,541] [INFO] [timer.py:197:stop] 0/3567, RunningAvgSamplesPerSec=12.000909648736489, CurrSamplesPerSec=12.009894953696204, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:03:25,002] [INFO] [timer.py:197:stop] 0/3568, RunningAvgSamplesPerSec=12.000883922425123, 
CurrSamplesPerSec=11.909865407901604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:03:31,486] [INFO] [timer.py:197:stop] 0/3569, RunningAvgSamplesPerSec=12.000865790951844, CurrSamplesPerSec=11.936555537516721, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:03:38,149] [INFO] [logging.py:68:log_dist] [Rank 0] step=3570, skipped=6, lr=[3.193333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 01:03:38,150] [INFO] [timer.py:197:stop] 0/3570, RunningAvgSamplesPerSec=12.000832565873305, CurrSamplesPerSec=11.883477967382657, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:03:44,603] [INFO] [timer.py:197:stop] 0/3571, RunningAvgSamplesPerSec=12.000822851829279, CurrSamplesPerSec=11.966262983452408, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:03:49,285] [INFO] [timer.py:197:stop] 0/3572, RunningAvgSamplesPerSec=12.001762608426507, CurrSamplesPerSec=16.65708716946968, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:03:55,838] [INFO] [timer.py:197:stop] 0/3573, RunningAvgSamplesPerSec=12.001730582960576, CurrSamplesPerSec=11.888478830895085, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:04:02,319] [INFO] [timer.py:197:stop] 0/3574, RunningAvgSamplesPerSec=12.001709523199432, CurrSamplesPerSec=11.926973553430093, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:04:08,827] [INFO] [timer.py:197:stop] 0/3575, RunningAvgSamplesPerSec=12.00170532701041, CurrSamplesPerSec=11.986735241026587, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 3.1822222222222226e-06, 'epoch': 94.08} [2022-12-20 01:04:15,341] [INFO] [timer.py:197:stop] 0/3576, RunningAvgSamplesPerSec=12.00170364018059, CurrSamplesPerSec=11.995679623238836, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:04:21,782] [INFO] [timer.py:197:stop] 0/3577, RunningAvgSamplesPerSec=12.00170598648109, CurrSamplesPerSec=12.01009752933383, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
01:04:28,214] [INFO] [timer.py:197:stop] 0/3578, RunningAvgSamplesPerSec=12.001707939828274, CurrSamplesPerSec=12.008695222709505, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:04:34,894] [INFO] [timer.py:197:stop] 0/3579, RunningAvgSamplesPerSec=12.001680613916916, CurrSamplesPerSec=11.904752561977695, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:04:41,324] [INFO] [logging.py:68:log_dist] [Rank 0] step=3580, skipped=6, lr=[3.1711111111111114e-06], mom=[[0.9, 0.999]] [2022-12-20 01:04:41,325] [INFO] [timer.py:197:stop] 0/3580, RunningAvgSamplesPerSec=12.00168993678109, CurrSamplesPerSec=12.035130766572436, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:04:47,714] [INFO] [timer.py:197:stop] 0/3581, RunningAvgSamplesPerSec=12.001692864055244, CurrSamplesPerSec=12.012175801911555, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:04:54,154] [INFO] [timer.py:197:stop] 0/3582, RunningAvgSamplesPerSec=12.001701864595667, CurrSamplesPerSec=12.034001516245302, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:05:00,764] [INFO] [timer.py:197:stop] 0/3583, RunningAvgSamplesPerSec=12.001705080755539, CurrSamplesPerSec=12.01322999262564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:05:07,290] [INFO] [timer.py:197:stop] 0/3584, RunningAvgSamplesPerSec=12.001679754711, CurrSamplesPerSec=11.911667570277766, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:05:13,768] [INFO] [timer.py:197:stop] 0/3585, RunningAvgSamplesPerSec=12.00166113388671, CurrSamplesPerSec=11.935330082111694, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:05:20,235] [INFO] [timer.py:197:stop] 0/3586, RunningAvgSamplesPerSec=12.001641024386895, CurrSamplesPerSec=11.930018793519762, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:05:26,777] [INFO] [timer.py:197:stop] 0/3587, RunningAvgSamplesPerSec=12.001645061249365, CurrSamplesPerSec=12.016130643746761, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 01:05:33,473] [INFO] [timer.py:197:stop] 0/3588, RunningAvgSamplesPerSec=12.001640585557578, CurrSamplesPerSec=11.985616659342595, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:05:40,006] [INFO] [timer.py:197:stop] 0/3589, RunningAvgSamplesPerSec=12.00164813212946, CurrSamplesPerSec=12.028771314865006, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:05:46,523] [INFO] [logging.py:68:log_dist] [Rank 0] step=3590, skipped=6, lr=[3.148888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 01:05:46,523] [INFO] [timer.py:197:stop] 0/3590, RunningAvgSamplesPerSec=12.001603406574203, CurrSamplesPerSec=11.843289677192097, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:05:52,977] [INFO] [timer.py:197:stop] 0/3591, RunningAvgSamplesPerSec=12.001569147864338, CurrSamplesPerSec=11.879895428242872, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:05:59,482] [INFO] [timer.py:197:stop] 0/3592, RunningAvgSamplesPerSec=12.001535841993451, CurrSamplesPerSec=11.883180216675601, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:06:06,067] [INFO] [timer.py:197:stop] 0/3593, RunningAvgSamplesPerSec=12.00154690869528, CurrSamplesPerSec=12.04140836083711, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:06:12,542] [INFO] [timer.py:197:stop] 0/3594, RunningAvgSamplesPerSec=12.001535642324693, CurrSamplesPerSec=11.961214068629229, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:06:19,101] [INFO] [timer.py:197:stop] 0/3595, RunningAvgSamplesPerSec=12.001485361810483, CurrSamplesPerSec=11.823556116703875, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:06:25,639] [INFO] [timer.py:197:stop] 0/3596, RunningAvgSamplesPerSec=12.001434638519056, CurrSamplesPerSec=11.821912755678404, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:06:32,088] [INFO] [timer.py:197:stop] 0/3597, RunningAvgSamplesPerSec=12.001413301171445, CurrSamplesPerSec=11.925213906925515, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 01:06:38,637] [INFO] [timer.py:197:stop] 0/3598, RunningAvgSamplesPerSec=12.001417341557866, CurrSamplesPerSec=12.015960136632607, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:06:45,174] [INFO] [timer.py:197:stop] 0/3599, RunningAvgSamplesPerSec=12.001399874171199, CurrSamplesPerSec=11.938914278709964, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:06:51,637] [INFO] [logging.py:68:log_dist] [Rank 0] step=3600, skipped=6, lr=[3.1266666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 01:06:51,637] [INFO] [timer.py:197:stop] 0/3600, RunningAvgSamplesPerSec=12.001381375625929, CurrSamplesPerSec=11.935209089337556, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 3.1266666666666667e-06, 'epoch': 94.74} [2022-12-20 01:06:58,115] [INFO] [timer.py:197:stop] 0/3601, RunningAvgSamplesPerSec=12.001359423086758, CurrSamplesPerSec=11.922890760993472, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:07:04,605] [INFO] [timer.py:197:stop] 0/3602, RunningAvgSamplesPerSec=12.001350267753823, CurrSamplesPerSec=11.968490467090241, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:07:11,108] [INFO] [timer.py:197:stop] 0/3603, RunningAvgSamplesPerSec=12.001353400936381, CurrSamplesPerSec=12.012643472051044, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:07:17,609] [INFO] [timer.py:197:stop] 0/3604, RunningAvgSamplesPerSec=12.00133201572814, CurrSamplesPerSec=11.924814998760587, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:07:24,075] [INFO] [timer.py:197:stop] 0/3605, RunningAvgSamplesPerSec=12.00130658797054, CurrSamplesPerSec=11.910409698584862, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:07:30,493] [INFO] [timer.py:197:stop] 0/3606, RunningAvgSamplesPerSec=12.001309654564963, CurrSamplesPerSec=12.012368778625175, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:07:36,932] [INFO] [timer.py:197:stop] 0/3607, 
RunningAvgSamplesPerSec=12.001292964142396, CurrSamplesPerSec=11.941440752720311, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:07:43,395] [INFO] [timer.py:197:stop] 0/3608, RunningAvgSamplesPerSec=12.001255415557706, CurrSamplesPerSec=11.867402914344671, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:07:49,878] [INFO] [timer.py:197:stop] 0/3609, RunningAvgSamplesPerSec=12.001220584318196, CurrSamplesPerSec=11.876920386828518, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:07:54,585] [INFO] [logging.py:68:log_dist] [Rank 0] step=3610, skipped=6, lr=[3.104444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 01:07:54,585] [INFO] [timer.py:197:stop] 0/3610, RunningAvgSamplesPerSec=12.002114576777096, CurrSamplesPerSec=16.411844240377793, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:08:01,111] [INFO] [timer.py:197:stop] 0/3611, RunningAvgSamplesPerSec=12.002086992938489, CurrSamplesPerSec=11.903383191751743, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:08:07,614] [INFO] [timer.py:197:stop] 0/3612, RunningAvgSamplesPerSec=12.002058398675969, CurrSamplesPerSec=11.899741693567954, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:08:14,141] [INFO] [timer.py:197:stop] 0/3613, RunningAvgSamplesPerSec=12.0020575099752, CurrSamplesPerSec=11.998850157777001, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:08:20,648] [INFO] [timer.py:197:stop] 0/3614, RunningAvgSamplesPerSec=12.002027067015844, CurrSamplesPerSec=11.893095547736076, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:08:27,191] [INFO] [timer.py:197:stop] 0/3615, RunningAvgSamplesPerSec=12.002004228608662, CurrSamplesPerSec=11.92007517358981, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:08:33,665] [INFO] [timer.py:197:stop] 0/3616, RunningAvgSamplesPerSec=12.001992286769484, CurrSamplesPerSec=11.959001013574124, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:08:40,254] [INFO] 
[timer.py:197:stop] 0/3617, RunningAvgSamplesPerSec=12.00198005749671, CurrSamplesPerSec=11.957945664641212, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:08:46,636] [INFO] [timer.py:197:stop] 0/3618, RunningAvgSamplesPerSec=12.001977493398183, CurrSamplesPerSec=11.992715432358247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:08:53,138] [INFO] [timer.py:197:stop] 0/3619, RunningAvgSamplesPerSec=12.001974454037653, CurrSamplesPerSec=11.990994183885213, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:08:59,697] [INFO] [logging.py:68:log_dist] [Rank 0] step=3620, skipped=6, lr=[3.0822222222222227e-06], mom=[[0.9, 0.999]] [2022-12-20 01:08:59,698] [INFO] [timer.py:197:stop] 0/3620, RunningAvgSamplesPerSec=12.001932851767638, CurrSamplesPerSec=11.853321189552403, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:09:06,168] [INFO] [timer.py:197:stop] 0/3621, RunningAvgSamplesPerSec=12.001923500655884, CurrSamplesPerSec=11.968186306672177, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:09:12,677] [INFO] [timer.py:197:stop] 0/3622, RunningAvgSamplesPerSec=12.001891316106068, CurrSamplesPerSec=11.886535246582474, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:09:19,110] [INFO] [timer.py:197:stop] 0/3623, RunningAvgSamplesPerSec=12.001886366308891, CurrSamplesPerSec=11.983994819169377, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:09:25,610] [INFO] [timer.py:197:stop] 0/3624, RunningAvgSamplesPerSec=12.00186434775561, CurrSamplesPerSec=11.922661461658471, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:09:32,064] [INFO] [timer.py:197:stop] 0/3625, RunningAvgSamplesPerSec=12.001864353158291, CurrSamplesPerSec=12.00188392170118, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 3.0711111111111115e-06, 'epoch': 95.39} [2022-12-20 01:09:38,894] [INFO] [timer.py:197:stop] 0/3626, RunningAvgSamplesPerSec=12.001839699641497, 
CurrSamplesPerSec=11.913180010602394, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:09:45,385] [INFO] [timer.py:197:stop] 0/3627, RunningAvgSamplesPerSec=12.00182089941549, CurrSamplesPerSec=11.934073574960179, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:09:51,873] [INFO] [timer.py:197:stop] 0/3628, RunningAvgSamplesPerSec=12.00179979390893, CurrSamplesPerSec=11.92577708566464, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:09:58,327] [INFO] [timer.py:197:stop] 0/3629, RunningAvgSamplesPerSec=12.001799856901302, CurrSamplesPerSec=12.002028271595064, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:10:04,806] [INFO] [logging.py:68:log_dist] [Rank 0] step=3630, skipped=6, lr=[3.0600000000000003e-06], mom=[[0.9, 0.999]] [2022-12-20 01:10:04,807] [INFO] [timer.py:197:stop] 0/3630, RunningAvgSamplesPerSec=12.001778652649822, CurrSamplesPerSec=11.925360656750023, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:10:11,314] [INFO] [timer.py:197:stop] 0/3631, RunningAvgSamplesPerSec=12.00175217431446, CurrSamplesPerSec=11.906451779162895, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:10:17,849] [INFO] [timer.py:197:stop] 0/3632, RunningAvgSamplesPerSec=12.001722916541967, CurrSamplesPerSec=11.89647779587915, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:10:24,354] [INFO] [timer.py:197:stop] 0/3633, RunningAvgSamplesPerSec=12.001699035346661, CurrSamplesPerSec=11.915632132232908, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:10:30,847] [INFO] [timer.py:197:stop] 0/3634, RunningAvgSamplesPerSec=12.001669484729089, CurrSamplesPerSec=11.895322228481263, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:10:37,327] [INFO] [timer.py:197:stop] 0/3635, RunningAvgSamplesPerSec=12.001659128311815, CurrSamplesPerSec=11.964162172675858, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:10:43,849] [INFO] [timer.py:197:stop] 0/3636, 
RunningAvgSamplesPerSec=12.001624944917602, CurrSamplesPerSec=11.878708912408438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:10:50,283] [INFO] [timer.py:197:stop] 0/3637, RunningAvgSamplesPerSec=12.001619834518408, CurrSamplesPerSec=11.983077344207983, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:10:56,776] [INFO] [timer.py:197:stop] 0/3638, RunningAvgSamplesPerSec=12.00159065822966, CurrSamplesPerSec=11.896464088015824, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:11:03,279] [INFO] [timer.py:197:stop] 0/3639, RunningAvgSamplesPerSec=12.001566492452294, CurrSamplesPerSec=11.914338521540667, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:11:09,761] [INFO] [logging.py:68:log_dist] [Rank 0] step=3640, skipped=6, lr=[3.037777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 01:11:09,762] [INFO] [timer.py:197:stop] 0/3640, RunningAvgSamplesPerSec=12.001559467859476, CurrSamplesPerSec=11.976065309653718, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:11:16,259] [INFO] [timer.py:197:stop] 0/3641, RunningAvgSamplesPerSec=12.001528369179564, CurrSamplesPerSec=11.889448228402806, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:11:22,762] [INFO] [timer.py:197:stop] 0/3642, RunningAvgSamplesPerSec=12.001507830289947, CurrSamplesPerSec=11.927229516367182, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:11:29,234] [INFO] [timer.py:197:stop] 0/3643, RunningAvgSamplesPerSec=12.001482886217595, CurrSamplesPerSec=11.91136840528376, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:11:35,751] [INFO] [timer.py:197:stop] 0/3644, RunningAvgSamplesPerSec=12.001460160872842, CurrSamplesPerSec=11.919283893073457, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:11:42,171] [INFO] [timer.py:197:stop] 0/3645, RunningAvgSamplesPerSec=12.001452544249581, CurrSamplesPerSec=11.973776788678977, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:11:48,657] [INFO] 
[timer.py:197:stop] 0/3646, RunningAvgSamplesPerSec=12.00142296160072, CurrSamplesPerSec=11.894612762297179, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:11:55,213] [INFO] [timer.py:197:stop] 0/3647, RunningAvgSamplesPerSec=12.00139541808943, CurrSamplesPerSec=11.901859517461078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:11:59,822] [INFO] [timer.py:197:stop] 0/3648, RunningAvgSamplesPerSec=12.002287280575041, CurrSamplesPerSec=16.461146553841353, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:12:06,337] [INFO] [timer.py:197:stop] 0/3649, RunningAvgSamplesPerSec=12.002286292237383, CurrSamplesPerSec=11.998683894989624, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:12:12,824] [INFO] [logging.py:68:log_dist] [Rank 0] step=3650, skipped=6, lr=[3.015555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 01:12:12,824] [INFO] [timer.py:197:stop] 0/3650, RunningAvgSamplesPerSec=12.002262962027245, CurrSamplesPerSec=11.917776781110456, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 3.015555555555556e-06, 'epoch': 96.05} [2022-12-20 01:12:19,401] [INFO] [timer.py:197:stop] 0/3651, RunningAvgSamplesPerSec=12.002235205965205, CurrSamplesPerSec=11.90182838305134, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:12:25,875] [INFO] [timer.py:197:stop] 0/3652, RunningAvgSamplesPerSec=12.002215520714376, CurrSamplesPerSec=11.930811500032044, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:12:32,420] [INFO] [timer.py:197:stop] 0/3653, RunningAvgSamplesPerSec=12.002157414412597, CurrSamplesPerSec=11.793753097702952, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:12:38,943] [INFO] [timer.py:197:stop] 0/3654, RunningAvgSamplesPerSec=12.00213649764838, CurrSamplesPerSec=11.92625235877799, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:12:45,368] [INFO] [timer.py:197:stop] 0/3655, RunningAvgSamplesPerSec=12.002144447593992, 
CurrSamplesPerSec=12.031248069943413, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:12:51,828] [INFO] [timer.py:197:stop] 0/3656, RunningAvgSamplesPerSec=12.002123958552911, CurrSamplesPerSec=11.927741475201861, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:12:58,259] [INFO] [timer.py:197:stop] 0/3657, RunningAvgSamplesPerSec=12.002129362559806, CurrSamplesPerSec=12.021908153288283, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:13:04,767] [INFO] [timer.py:197:stop] 0/3658, RunningAvgSamplesPerSec=12.002108392639066, CurrSamplesPerSec=11.925949811533572, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:13:11,298] [INFO] [timer.py:197:stop] 0/3659, RunningAvgSamplesPerSec=12.002105172226269, CurrSamplesPerSec=11.990342884765244, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:13:17,739] [INFO] [logging.py:68:log_dist] [Rank 0] step=3660, skipped=6, lr=[2.9933333333333336e-06], mom=[[0.9, 0.999]] [2022-12-20 01:13:17,740] [INFO] [timer.py:197:stop] 0/3660, RunningAvgSamplesPerSec=12.00210753479199, CurrSamplesPerSec=12.01075366338886, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:13:24,267] [INFO] [timer.py:197:stop] 0/3661, RunningAvgSamplesPerSec=12.002066075708484, CurrSamplesPerSec=11.85230167876893, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:13:30,738] [INFO] [timer.py:197:stop] 0/3662, RunningAvgSamplesPerSec=12.002061971037655, CurrSamplesPerSec=11.98706175638345, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:13:37,188] [INFO] [timer.py:197:stop] 0/3663, RunningAvgSamplesPerSec=12.002028662812561, CurrSamplesPerSec=11.881346695810777, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:13:43,723] [INFO] [timer.py:197:stop] 0/3664, RunningAvgSamplesPerSec=12.002009481461203, CurrSamplesPerSec=11.932195144566965, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:13:50,265] [INFO] [timer.py:197:stop] 0/3665, 
RunningAvgSamplesPerSec=12.001999334740713, CurrSamplesPerSec=11.964956756673393, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:13:56,893] [INFO] [timer.py:197:stop] 0/3666, RunningAvgSamplesPerSec=12.00194812005061, CurrSamplesPerSec=11.817236676174677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:14:03,447] [INFO] [timer.py:197:stop] 0/3667, RunningAvgSamplesPerSec=12.0019527377851, CurrSamplesPerSec=12.018896008731264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:14:09,869] [INFO] [timer.py:197:stop] 0/3668, RunningAvgSamplesPerSec=12.00196083940411, CurrSamplesPerSec=12.031726933443487, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:14:16,285] [INFO] [timer.py:197:stop] 0/3669, RunningAvgSamplesPerSec=12.001966682294007, CurrSamplesPerSec=12.023425024026878, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:14:22,757] [INFO] [logging.py:68:log_dist] [Rank 0] step=3670, skipped=6, lr=[2.9711111111111112e-06], mom=[[0.9, 0.999]] [2022-12-20 01:14:22,758] [INFO] [timer.py:197:stop] 0/3670, RunningAvgSamplesPerSec=12.00194429299892, CurrSamplesPerSec=11.920400714134857, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:14:29,178] [INFO] [timer.py:197:stop] 0/3671, RunningAvgSamplesPerSec=12.001915044445221, CurrSamplesPerSec=11.895582106143058, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:14:35,695] [INFO] [timer.py:197:stop] 0/3672, RunningAvgSamplesPerSec=12.001890152378166, CurrSamplesPerSec=11.911251069053323, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:14:42,176] [INFO] [timer.py:197:stop] 0/3673, RunningAvgSamplesPerSec=12.001864115684809, CurrSamplesPerSec=11.907064418912201, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:14:48,576] [INFO] [timer.py:197:stop] 0/3674, RunningAvgSamplesPerSec=12.001874511154696, CurrSamplesPerSec=12.040158042532475, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:14:55,136] [INFO] 
[timer.py:197:stop] 0/3675, RunningAvgSamplesPerSec=12.001848594647038, CurrSamplesPerSec=11.90743203302483, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.96e-06, 'epoch': 96.71} [2022-12-20 01:15:01,673] [INFO] [timer.py:197:stop] 0/3676, RunningAvgSamplesPerSec=12.001816872668595, CurrSamplesPerSec=11.886422609771936, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:15:08,130] [INFO] [timer.py:197:stop] 0/3677, RunningAvgSamplesPerSec=12.001817478476566, CurrSamplesPerSec=12.004043629918963, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:15:14,645] [INFO] [timer.py:197:stop] 0/3678, RunningAvgSamplesPerSec=12.00179025708349, CurrSamplesPerSec=11.902578820447065, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:15:21,095] [INFO] [timer.py:197:stop] 0/3679, RunningAvgSamplesPerSec=12.001756477025136, CurrSamplesPerSec=11.878852942895797, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:15:27,540] [INFO] [logging.py:68:log_dist] [Rank 0] step=3680, skipped=6, lr=[2.948888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 01:15:27,541] [INFO] [timer.py:197:stop] 0/3680, RunningAvgSamplesPerSec=12.001736419305939, CurrSamplesPerSec=11.928434756591868, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:15:33,985] [INFO] [timer.py:197:stop] 0/3681, RunningAvgSamplesPerSec=12.001741299567259, CurrSamplesPerSec=12.019717793347715, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:15:40,487] [INFO] [timer.py:197:stop] 0/3682, RunningAvgSamplesPerSec=12.001720458883726, CurrSamplesPerSec=11.925534430165804, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:15:46,976] [INFO] [timer.py:197:stop] 0/3683, RunningAvgSamplesPerSec=12.00169512124779, CurrSamplesPerSec=11.909171641512604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:15:53,448] [INFO] [timer.py:197:stop] 0/3684, RunningAvgSamplesPerSec=12.001694196978843, CurrSamplesPerSec=11.998292927431, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:15:59,912] [INFO] [timer.py:197:stop] 0/3685, RunningAvgSamplesPerSec=12.00169933658709, CurrSamplesPerSec=12.020653268453666, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:16:04,495] [INFO] [timer.py:197:stop] 0/3686, RunningAvgSamplesPerSec=12.00259606352899, CurrSamplesPerSec=16.559449918876197, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:16:10,950] [INFO] [timer.py:197:stop] 0/3687, RunningAvgSamplesPerSec=12.002576903348658, CurrSamplesPerSec=11.932403594112893, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:16:17,378] [INFO] [timer.py:197:stop] 0/3688, RunningAvgSamplesPerSec=12.00258271628164, CurrSamplesPerSec=12.024041681900036, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:16:23,925] [INFO] [timer.py:197:stop] 0/3689, RunningAvgSamplesPerSec=12.002547496976154, CurrSamplesPerSec=11.874118588109605, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:16:30,374] [INFO] [logging.py:68:log_dist] [Rank 0] step=3690, skipped=6, lr=[2.9266666666666673e-06], mom=[[0.9, 0.999]] [2022-12-20 01:16:30,375] [INFO] [timer.py:197:stop] 0/3690, RunningAvgSamplesPerSec=12.002518943466121, CurrSamplesPerSec=11.898157775505036, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:16:36,797] [INFO] [timer.py:197:stop] 0/3691, RunningAvgSamplesPerSec=12.002520770180883, CurrSamplesPerSec=12.009261478760418, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:16:43,296] [INFO] [timer.py:197:stop] 0/3692, RunningAvgSamplesPerSec=12.00249682967409, CurrSamplesPerSec=11.914825575834174, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:16:49,689] [INFO] [timer.py:197:stop] 0/3693, RunningAvgSamplesPerSec=12.00249833278336, CurrSamplesPerSec=12.008047370947317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:16:56,160] [INFO] [timer.py:197:stop] 0/3694, RunningAvgSamplesPerSec=12.002483609986742, 
CurrSamplesPerSec=11.948386760212017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:17:02,689] [INFO] [timer.py:197:stop] 0/3695, RunningAvgSamplesPerSec=12.002455045165327, CurrSamplesPerSec=11.89791255116723, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:17:09,235] [INFO] [timer.py:197:stop] 0/3696, RunningAvgSamplesPerSec=12.002414723894887, CurrSamplesPerSec=11.85533351087486, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:17:15,766] [INFO] [timer.py:197:stop] 0/3697, RunningAvgSamplesPerSec=12.002358335575188, CurrSamplesPerSec=11.797614137827859, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:17:22,329] [INFO] [timer.py:197:stop] 0/3698, RunningAvgSamplesPerSec=12.00233582538132, CurrSamplesPerSec=11.919733242831109, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:17:28,807] [INFO] [timer.py:197:stop] 0/3699, RunningAvgSamplesPerSec=12.002334809836674, CurrSamplesPerSec=11.998582530583404, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:17:35,311] [INFO] [logging.py:68:log_dist] [Rank 0] step=3700, skipped=6, lr=[2.904444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 01:17:35,312] [INFO] [timer.py:197:stop] 0/3700, RunningAvgSamplesPerSec=12.002336042591805, CurrSamplesPerSec=12.006895270002046, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.904444444444445e-06, 'epoch': 97.37} [2022-12-20 01:17:41,838] [INFO] [timer.py:197:stop] 0/3701, RunningAvgSamplesPerSec=12.002317389763387, CurrSamplesPerSec=11.933733493311967, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:17:48,260] [INFO] [timer.py:197:stop] 0/3702, RunningAvgSamplesPerSec=12.002312342253713, CurrSamplesPerSec=11.983670610800317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:17:54,791] [INFO] [timer.py:197:stop] 0/3703, RunningAvgSamplesPerSec=12.002292512048422, CurrSamplesPerSec=11.929366679182552, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
01:18:01,237] [INFO] [timer.py:197:stop] 0/3704, RunningAvgSamplesPerSec=12.002287231486553, CurrSamplesPerSec=11.98277565135907, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:18:07,621] [INFO] [timer.py:197:stop] 0/3705, RunningAvgSamplesPerSec=12.002293882315913, CurrSamplesPerSec=12.026965878126296, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:18:14,061] [INFO] [timer.py:197:stop] 0/3706, RunningAvgSamplesPerSec=12.002275533778128, CurrSamplesPerSec=11.934713469106113, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:18:20,545] [INFO] [timer.py:197:stop] 0/3707, RunningAvgSamplesPerSec=12.002257738797265, CurrSamplesPerSec=11.936705220600441, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:18:27,044] [INFO] [timer.py:197:stop] 0/3708, RunningAvgSamplesPerSec=12.00222978363564, CurrSamplesPerSec=11.89954229606628, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:18:34,121] [INFO] [timer.py:197:stop] 0/3709, RunningAvgSamplesPerSec=12.0021730068458, CurrSamplesPerSec=11.795384492146361, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:18:41,041] [INFO] [logging.py:68:log_dist] [Rank 0] step=3710, skipped=6, lr=[2.8822222222222225e-06], mom=[[0.9, 0.999]] [2022-12-20 01:18:41,042] [INFO] [timer.py:197:stop] 0/3710, RunningAvgSamplesPerSec=12.002162308764023, CurrSamplesPerSec=11.962635161259444, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:18:47,731] [INFO] [timer.py:197:stop] 0/3711, RunningAvgSamplesPerSec=12.002150554053058, CurrSamplesPerSec=11.958721842077068, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:18:54,731] [INFO] [timer.py:197:stop] 0/3712, RunningAvgSamplesPerSec=12.002125971507137, CurrSamplesPerSec=11.911636913103218, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:19:01,348] [INFO] [timer.py:197:stop] 0/3713, RunningAvgSamplesPerSec=12.002118653006079, CurrSamplesPerSec=11.975028315372198, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 01:19:07,852] [INFO] [timer.py:197:stop] 0/3714, RunningAvgSamplesPerSec=12.002100302265985, CurrSamplesPerSec=11.934385023987213, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:19:14,377] [INFO] [timer.py:197:stop] 0/3715, RunningAvgSamplesPerSec=12.002084020540039, CurrSamplesPerSec=11.941949150921229, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:19:20,777] [INFO] [timer.py:197:stop] 0/3716, RunningAvgSamplesPerSec=12.002084465723533, CurrSamplesPerSec=12.00373765978424, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:19:27,241] [INFO] [timer.py:197:stop] 0/3717, RunningAvgSamplesPerSec=12.00205095469168, CurrSamplesPerSec=11.878868712869332, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:19:33,793] [INFO] [timer.py:197:stop] 0/3718, RunningAvgSamplesPerSec=12.002037753052178, CurrSamplesPerSec=11.953193309749823, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:19:40,443] [INFO] [timer.py:197:stop] 0/3719, RunningAvgSamplesPerSec=12.002022024854298, CurrSamplesPerSec=11.943859351244887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:19:47,046] [INFO] [logging.py:68:log_dist] [Rank 0] step=3720, skipped=6, lr=[2.86e-06], mom=[[0.9, 0.999]] [2022-12-20 01:19:47,047] [INFO] [timer.py:197:stop] 0/3720, RunningAvgSamplesPerSec=12.001996242292385, CurrSamplesPerSec=11.906921816099793, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:19:53,674] [INFO] [timer.py:197:stop] 0/3721, RunningAvgSamplesPerSec=12.001978190967863, CurrSamplesPerSec=11.935236683964842, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:20:00,090] [INFO] [timer.py:197:stop] 0/3722, RunningAvgSamplesPerSec=12.001982572969617, CurrSamplesPerSec=12.018301401661933, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:20:06,579] [INFO] [timer.py:197:stop] 0/3723, RunningAvgSamplesPerSec=12.001970738542001, CurrSamplesPerSec=11.958107604120245, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 01:20:11,319] [INFO] [timer.py:197:stop] 0/3724, RunningAvgSamplesPerSec=12.002845231145457, CurrSamplesPerSec=16.467542827899916, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:20:17,861] [INFO] [timer.py:197:stop] 0/3725, RunningAvgSamplesPerSec=12.002802788282642, CurrSamplesPerSec=11.846883107936138, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.8488888888888894e-06, 'epoch': 98.03} [2022-12-20 01:20:24,390] [INFO] [timer.py:197:stop] 0/3726, RunningAvgSamplesPerSec=12.002799059651734, CurrSamplesPerSec=11.988933407234482, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:20:30,907] [INFO] [timer.py:197:stop] 0/3727, RunningAvgSamplesPerSec=12.002796911403328, CurrSamplesPerSec=11.99480216440941, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:20:37,570] [INFO] [timer.py:197:stop] 0/3728, RunningAvgSamplesPerSec=12.002777382494298, CurrSamplesPerSec=11.9304705437856, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:20:44,218] [INFO] [timer.py:197:stop] 0/3729, RunningAvgSamplesPerSec=12.002756828767772, CurrSamplesPerSec=11.926659311680064, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:20:50,693] [INFO] [logging.py:68:log_dist] [Rank 0] step=3730, skipped=6, lr=[2.837777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 01:20:50,694] [INFO] [timer.py:197:stop] 0/3730, RunningAvgSamplesPerSec=12.002736233736467, CurrSamplesPerSec=11.926466429498769, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:20:57,201] [INFO] [timer.py:197:stop] 0/3731, RunningAvgSamplesPerSec=12.002712164754747, CurrSamplesPerSec=11.913648992679576, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:21:03,707] [INFO] [timer.py:197:stop] 0/3732, RunningAvgSamplesPerSec=12.00267540395337, CurrSamplesPerSec=11.867142692560735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:21:10,366] [INFO] [timer.py:197:stop] 0/3733, 
RunningAvgSamplesPerSec=12.002610652913782, CurrSamplesPerSec=11.765854652329415, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:21:16,915] [INFO] [timer.py:197:stop] 0/3734, RunningAvgSamplesPerSec=12.002612160232543, CurrSamplesPerSec=12.008238603499269, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:21:23,504] [INFO] [timer.py:197:stop] 0/3735, RunningAvgSamplesPerSec=12.002585582133367, CurrSamplesPerSec=11.904209316119447, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:21:29,920] [INFO] [timer.py:197:stop] 0/3736, RunningAvgSamplesPerSec=12.002590923066487, CurrSamplesPerSec=12.022561809258676, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:21:36,386] [INFO] [timer.py:197:stop] 0/3737, RunningAvgSamplesPerSec=12.002564867309067, CurrSamplesPerSec=11.906055181357296, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:21:42,761] [INFO] [timer.py:197:stop] 0/3738, RunningAvgSamplesPerSec=12.00254470182354, CurrSamplesPerSec=11.92769642530076, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:21:49,434] [INFO] [timer.py:197:stop] 0/3739, RunningAvgSamplesPerSec=12.002538830584985, CurrSamplesPerSec=11.980643907580559, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:21:55,912] [INFO] [logging.py:68:log_dist] [Rank 0] step=3740, skipped=6, lr=[2.815555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 01:21:55,912] [INFO] [timer.py:197:stop] 0/3740, RunningAvgSamplesPerSec=12.002540660917548, CurrSamplesPerSec=12.009384514888305, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:22:02,391] [INFO] [timer.py:197:stop] 0/3741, RunningAvgSamplesPerSec=12.002537577863508, CurrSamplesPerSec=11.991024179627903, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:22:08,872] [INFO] [timer.py:197:stop] 0/3742, RunningAvgSamplesPerSec=12.002511309271553, CurrSamplesPerSec=11.905090465849765, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:22:15,334] [INFO] 
[timer.py:197:stop] 0/3743, RunningAvgSamplesPerSec=12.002482367215034, CurrSamplesPerSec=11.895206789511143, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:22:22,002] [INFO] [timer.py:197:stop] 0/3744, RunningAvgSamplesPerSec=12.00245840323543, CurrSamplesPerSec=11.91347397926575, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:22:28,684] [INFO] [timer.py:197:stop] 0/3745, RunningAvgSamplesPerSec=12.002456830310415, CurrSamplesPerSec=11.99657383063379, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:22:35,171] [INFO] [timer.py:197:stop] 0/3746, RunningAvgSamplesPerSec=12.002434523123913, CurrSamplesPerSec=11.91951570885908, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:22:41,557] [INFO] [timer.py:197:stop] 0/3747, RunningAvgSamplesPerSec=12.002430922988713, CurrSamplesPerSec=11.988967140857609, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:22:48,115] [INFO] [timer.py:197:stop] 0/3748, RunningAvgSamplesPerSec=12.002430143311852, CurrSamplesPerSec=11.999510963820128, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:22:54,723] [INFO] [timer.py:197:stop] 0/3749, RunningAvgSamplesPerSec=12.002426948298387, CurrSamplesPerSec=11.990470353859761, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:23:01,417] [INFO] [logging.py:68:log_dist] [Rank 0] step=3750, skipped=6, lr=[2.7933333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 01:23:01,417] [INFO] [timer.py:197:stop] 0/3750, RunningAvgSamplesPerSec=12.00240312145173, CurrSamplesPerSec=11.913783296490106, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.7933333333333334e-06, 'epoch': 98.68} [2022-12-20 01:23:08,039] [INFO] [timer.py:197:stop] 0/3751, RunningAvgSamplesPerSec=12.002370257580878, CurrSamplesPerSec=11.880448024926173, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:23:14,484] [INFO] [timer.py:197:stop] 0/3752, RunningAvgSamplesPerSec=12.002368626543769, 
CurrSamplesPerSec=11.996256982908173, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:23:21,110] [INFO] [timer.py:197:stop] 0/3753, RunningAvgSamplesPerSec=12.002345663474326, CurrSamplesPerSec=11.916847726442363, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:23:27,548] [INFO] [timer.py:197:stop] 0/3754, RunningAvgSamplesPerSec=12.002335373927043, CurrSamplesPerSec=11.963863031148867, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:23:34,037] [INFO] [timer.py:197:stop] 0/3755, RunningAvgSamplesPerSec=12.00230249115106, CurrSamplesPerSec=11.88018197291278, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:23:40,534] [INFO] [timer.py:197:stop] 0/3756, RunningAvgSamplesPerSec=12.002303558585728, CurrSamplesPerSec=12.006310978826555, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:23:47,127] [INFO] [timer.py:197:stop] 0/3757, RunningAvgSamplesPerSec=12.002276317093443, CurrSamplesPerSec=11.900875959907857, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:23:53,719] [INFO] [timer.py:197:stop] 0/3758, RunningAvgSamplesPerSec=12.002254345361129, CurrSamplesPerSec=11.920313901584251, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:24:00,275] [INFO] [timer.py:197:stop] 0/3759, RunningAvgSamplesPerSec=12.002231221563777, CurrSamplesPerSec=11.916002391408686, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:24:06,715] [INFO] [logging.py:68:log_dist] [Rank 0] step=3760, skipped=6, lr=[2.771111111111111e-06], mom=[[0.9, 0.999]] [2022-12-20 01:24:06,715] [INFO] [timer.py:197:stop] 0/3760, RunningAvgSamplesPerSec=12.002234998732675, CurrSamplesPerSec=12.016442625125945, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:24:13,167] [INFO] [timer.py:197:stop] 0/3761, RunningAvgSamplesPerSec=12.002216203137513, CurrSamplesPerSec=11.931995718739556, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:24:17,780] [INFO] [timer.py:197:stop] 0/3762, 
RunningAvgSamplesPerSec=12.003109975014832, CurrSamplesPerSec=16.669188200394554, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:24:24,225] [INFO] [timer.py:197:stop] 0/3763, RunningAvgSamplesPerSec=12.003108511618013, CurrSamplesPerSec=11.997608661451107, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:24:30,693] [INFO] [timer.py:197:stop] 0/3764, RunningAvgSamplesPerSec=12.003101806212625, CurrSamplesPerSec=11.977935665741578, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:24:37,167] [INFO] [timer.py:197:stop] 0/3765, RunningAvgSamplesPerSec=12.003077456162538, CurrSamplesPerSec=11.91216656440893, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:24:43,703] [INFO] [timer.py:197:stop] 0/3766, RunningAvgSamplesPerSec=12.003077160427795, CurrSamplesPerSec=12.001964413778795, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:24:50,379] [INFO] [timer.py:197:stop] 0/3767, RunningAvgSamplesPerSec=12.003055156863498, CurrSamplesPerSec=11.920801444553458, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:24:56,994] [INFO] [timer.py:197:stop] 0/3768, RunningAvgSamplesPerSec=12.003051224644754, CurrSamplesPerSec=11.988264663956027, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:25:03,533] [INFO] [timer.py:197:stop] 0/3769, RunningAvgSamplesPerSec=12.003034566123235, CurrSamplesPerSec=11.940624856127581, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:25:10,024] [INFO] [logging.py:68:log_dist] [Rank 0] step=3770, skipped=6, lr=[2.748888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 01:25:10,024] [INFO] [timer.py:197:stop] 0/3770, RunningAvgSamplesPerSec=12.003005514161826, CurrSamplesPerSec=11.894555840000539, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:25:16,527] [INFO] [timer.py:197:stop] 0/3771, RunningAvgSamplesPerSec=12.002982023836896, CurrSamplesPerSec=11.915118567956409, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:25:23,191] [INFO] 
[timer.py:197:stop] 0/3772, RunningAvgSamplesPerSec=12.002962867905662, CurrSamplesPerSec=11.931195960854668, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:25:29,800] [INFO] [timer.py:197:stop] 0/3773, RunningAvgSamplesPerSec=12.002947273899677, CurrSamplesPerSec=11.944444488940968, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:25:36,414] [INFO] [timer.py:197:stop] 0/3774, RunningAvgSamplesPerSec=12.002956239328027, CurrSamplesPerSec=12.036860392521112, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:25:42,895] [INFO] [timer.py:197:stop] 0/3775, RunningAvgSamplesPerSec=12.002928051135349, CurrSamplesPerSec=11.897536032827897, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.7377777777777783e-06, 'epoch': 99.34} [2022-12-20 01:25:49,379] [INFO] [timer.py:197:stop] 0/3776, RunningAvgSamplesPerSec=12.002900537868188, CurrSamplesPerSec=11.899983300552854, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:25:55,847] [INFO] [timer.py:197:stop] 0/3777, RunningAvgSamplesPerSec=12.00287326151822, CurrSamplesPerSec=11.90080789782972, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:26:02,331] [INFO] [timer.py:197:stop] 0/3778, RunningAvgSamplesPerSec=12.002843431102372, CurrSamplesPerSec=11.8912805633986, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:26:08,868] [INFO] [timer.py:197:stop] 0/3779, RunningAvgSamplesPerSec=12.002820318694189, CurrSamplesPerSec=11.916178008091277, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:26:15,463] [INFO] [logging.py:68:log_dist] [Rank 0] step=3780, skipped=6, lr=[2.726666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 01:26:15,464] [INFO] [timer.py:197:stop] 0/3780, RunningAvgSamplesPerSec=12.002798747571536, CurrSamplesPerSec=11.921874073584222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:26:22,094] [INFO] [timer.py:197:stop] 0/3781, RunningAvgSamplesPerSec=12.002800507806182, 
CurrSamplesPerSec=12.009454361846904, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:26:28,601] [INFO] [timer.py:197:stop] 0/3782, RunningAvgSamplesPerSec=12.002803165004929, CurrSamplesPerSec=12.012853129107196, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:26:35,103] [INFO] [timer.py:197:stop] 0/3783, RunningAvgSamplesPerSec=12.002779369047683, CurrSamplesPerSec=11.913499887271636, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:26:41,572] [INFO] [timer.py:197:stop] 0/3784, RunningAvgSamplesPerSec=12.002761541563041, CurrSamplesPerSec=11.935732347653875, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:26:48,155] [INFO] [timer.py:197:stop] 0/3785, RunningAvgSamplesPerSec=12.002726908659715, CurrSamplesPerSec=11.873159566085375, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:26:54,651] [INFO] [timer.py:197:stop] 0/3786, RunningAvgSamplesPerSec=12.002694099324355, CurrSamplesPerSec=11.879847058763133, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:27:01,134] [INFO] [timer.py:197:stop] 0/3787, RunningAvgSamplesPerSec=12.0026906688517, CurrSamplesPerSec=11.989723787712729, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:27:07,657] [INFO] [timer.py:197:stop] 0/3788, RunningAvgSamplesPerSec=12.002669680961908, CurrSamplesPerSec=11.923752963054351, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:27:14,200] [INFO] [timer.py:197:stop] 0/3789, RunningAvgSamplesPerSec=12.002640679989405, CurrSamplesPerSec=11.893838560340875, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:27:20,713] [INFO] [logging.py:68:log_dist] [Rank 0] step=3790, skipped=6, lr=[2.7044444444444447e-06], mom=[[0.9, 0.999]] [2022-12-20 01:27:20,714] [INFO] [timer.py:197:stop] 0/3790, RunningAvgSamplesPerSec=12.002617920076055, CurrSamplesPerSec=11.917040826677683, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:27:27,223] [INFO] [timer.py:197:stop] 0/3791, 
RunningAvgSamplesPerSec=12.002619763278421, CurrSamplesPerSec=12.009605878811433, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:27:33,694] [INFO] [timer.py:197:stop] 0/3792, RunningAvgSamplesPerSec=12.002620966453371, CurrSamplesPerSec=12.007181528988637, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:27:40,113] [INFO] [timer.py:197:stop] 0/3793, RunningAvgSamplesPerSec=12.00260310689458, CurrSamplesPerSec=11.935295057635326, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:27:46,557] [INFO] [timer.py:197:stop] 0/3794, RunningAvgSamplesPerSec=12.002590680293224, CurrSamplesPerSec=11.95566566023413, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:27:52,979] [INFO] [timer.py:197:stop] 0/3795, RunningAvgSamplesPerSec=12.002595340284198, CurrSamplesPerSec=12.02029208675785, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:27:59,528] [INFO] [timer.py:197:stop] 0/3796, RunningAvgSamplesPerSec=12.002565172656446, CurrSamplesPerSec=11.889220213575536, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:28:06,009] [INFO] [timer.py:197:stop] 0/3797, RunningAvgSamplesPerSec=12.002537709371332, CurrSamplesPerSec=11.89923899257433, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:28:12,481] [INFO] [timer.py:197:stop] 0/3798, RunningAvgSamplesPerSec=12.002542483222479, CurrSamplesPerSec=12.020686642522488, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:28:18,925] [INFO] [timer.py:197:stop] 0/3799, RunningAvgSamplesPerSec=12.002524572860725, CurrSamplesPerSec=11.934919884093622, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:28:23,578] [INFO] [logging.py:68:log_dist] [Rank 0] step=3800, skipped=6, lr=[2.6822222222222223e-06], mom=[[0.9, 0.999]] [2022-12-20 01:28:23,579] [INFO] [timer.py:197:stop] 0/3800, RunningAvgSamplesPerSec=12.00338947650336, CurrSamplesPerSec=16.52477120154677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 
2.6822222222222223e-06, 'epoch': 100.0} [2022-12-20 01:28:30,092] [INFO] [timer.py:197:stop] 0/3801, RunningAvgSamplesPerSec=12.003394153020466, CurrSamplesPerSec=12.021181892418811, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:28:36,600] [INFO] [timer.py:197:stop] 0/3802, RunningAvgSamplesPerSec=12.003352441693597, CurrSamplesPerSec=11.846956305907854, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:28:43,116] [INFO] [timer.py:197:stop] 0/3803, RunningAvgSamplesPerSec=12.003336464393003, CurrSamplesPerSec=11.94292835113836, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:28:49,595] [INFO] [timer.py:197:stop] 0/3804, RunningAvgSamplesPerSec=12.003327877607356, CurrSamplesPerSec=11.970778035257222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:28:56,053] [INFO] [timer.py:197:stop] 0/3805, RunningAvgSamplesPerSec=12.003334169647026, CurrSamplesPerSec=12.027304288870235, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:29:02,540] [INFO] [timer.py:197:stop] 0/3806, RunningAvgSamplesPerSec=12.00330977903103, CurrSamplesPerSec=11.91126375393976, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:29:08,998] [INFO] [timer.py:197:stop] 0/3807, RunningAvgSamplesPerSec=12.00331537269584, CurrSamplesPerSec=12.024631470656047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:29:15,470] [INFO] [timer.py:197:stop] 0/3808, RunningAvgSamplesPerSec=12.003297943639234, CurrSamplesPerSec=11.937344866143295, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:29:21,985] [INFO] [timer.py:197:stop] 0/3809, RunningAvgSamplesPerSec=12.00327955323937, CurrSamplesPerSec=11.933691581314699, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:29:28,500] [INFO] [logging.py:68:log_dist] [Rank 0] step=3810, skipped=6, lr=[2.6600000000000004e-06], mom=[[0.9, 0.999]] [2022-12-20 01:29:28,500] [INFO] [timer.py:197:stop] 0/3810, RunningAvgSamplesPerSec=12.00324483363935, 
CurrSamplesPerSec=11.872507351556758, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:29:34,955] [INFO] [timer.py:197:stop] 0/3811, RunningAvgSamplesPerSec=12.003253225512033, CurrSamplesPerSec=12.035294803358319, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:29:41,398] [INFO] [timer.py:197:stop] 0/3812, RunningAvgSamplesPerSec=12.003256283977068, CurrSamplesPerSec=12.01491729779918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:29:47,911] [INFO] [timer.py:197:stop] 0/3813, RunningAvgSamplesPerSec=12.003229688899632, CurrSamplesPerSec=11.902750874984946, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:29:54,380] [INFO] [timer.py:197:stop] 0/3814, RunningAvgSamplesPerSec=12.003207133778707, CurrSamplesPerSec=11.917860911094245, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:30:00,840] [INFO] [timer.py:197:stop] 0/3815, RunningAvgSamplesPerSec=12.003206138923586, CurrSamplesPerSec=11.999414949331966, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:30:07,283] [INFO] [timer.py:197:stop] 0/3816, RunningAvgSamplesPerSec=12.00320181684699, CurrSamplesPerSec=11.986744340393935, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:30:13,689] [INFO] [timer.py:197:stop] 0/3817, RunningAvgSamplesPerSec=12.003208377779242, CurrSamplesPerSec=12.028284063015581, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:30:20,162] [INFO] [timer.py:197:stop] 0/3818, RunningAvgSamplesPerSec=12.003187391538715, CurrSamplesPerSec=11.923655508905158, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:30:26,621] [INFO] [timer.py:197:stop] 0/3819, RunningAvgSamplesPerSec=12.003193341965217, CurrSamplesPerSec=12.025943217448539, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:30:33,175] [INFO] [logging.py:68:log_dist] [Rank 0] step=3820, skipped=6, lr=[2.637777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 01:30:33,175] [INFO] [timer.py:197:stop] 0/3820, 
RunningAvgSamplesPerSec=12.003153122847511, CurrSamplesPerSec=11.851575881414066, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:30:39,633] [INFO] [timer.py:197:stop] 0/3821, RunningAvgSamplesPerSec=12.00315795210262, CurrSamplesPerSec=12.021624421955591, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:30:46,108] [INFO] [timer.py:197:stop] 0/3822, RunningAvgSamplesPerSec=12.003133050123841, CurrSamplesPerSec=11.908780145663652, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:30:52,580] [INFO] [timer.py:197:stop] 0/3823, RunningAvgSamplesPerSec=12.003107630883829, CurrSamplesPerSec=11.906785554381495, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:30:59,030] [INFO] [timer.py:197:stop] 0/3824, RunningAvgSamplesPerSec=12.003111915906636, CurrSamplesPerSec=12.019507358423622, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:31:05,509] [INFO] [timer.py:197:stop] 0/3825, RunningAvgSamplesPerSec=12.0030922163338, CurrSamplesPerSec=11.928269909447772, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.6266666666666668e-06, 'epoch': 100.66} [2022-12-20 01:31:11,914] [INFO] [timer.py:197:stop] 0/3826, RunningAvgSamplesPerSec=12.003092365743221, CurrSamplesPerSec=12.003663585149512, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:31:18,372] [INFO] [timer.py:197:stop] 0/3827, RunningAvgSamplesPerSec=12.003067411021625, CurrSamplesPerSec=11.908393430096805, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:31:24,865] [INFO] [timer.py:197:stop] 0/3828, RunningAvgSamplesPerSec=12.003045591676274, CurrSamplesPerSec=11.920163041318292, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:31:31,346] [INFO] [timer.py:197:stop] 0/3829, RunningAvgSamplesPerSec=12.003019316355427, CurrSamplesPerSec=11.903325129815133, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:31:37,799] [INFO] [logging.py:68:log_dist] [Rank 0] step=3830, skipped=6, 
lr=[2.6155555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 01:31:37,800] [INFO] [timer.py:197:stop] 0/3830, RunningAvgSamplesPerSec=12.00302017194467, CurrSamplesPerSec=12.006295405667046, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:31:44,252] [INFO] [timer.py:197:stop] 0/3831, RunningAvgSamplesPerSec=12.003020864069118, CurrSamplesPerSec=12.005670901565384, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:31:50,725] [INFO] [timer.py:197:stop] 0/3832, RunningAvgSamplesPerSec=12.002995103977273, CurrSamplesPerSec=11.905163856882465, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:31:57,206] [INFO] [timer.py:197:stop] 0/3833, RunningAvgSamplesPerSec=12.002970927779069, CurrSamplesPerSec=11.911085110944823, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:32:03,643] [INFO] [timer.py:197:stop] 0/3834, RunningAvgSamplesPerSec=12.002949878924493, CurrSamplesPerSec=11.922849984254501, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:32:10,202] [INFO] [timer.py:197:stop] 0/3835, RunningAvgSamplesPerSec=12.002905789236197, CurrSamplesPerSec=11.836299844071615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:32:16,713] [INFO] [timer.py:197:stop] 0/3836, RunningAvgSamplesPerSec=12.002884072857741, CurrSamplesPerSec=11.920218620972017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:32:23,195] [INFO] [timer.py:197:stop] 0/3837, RunningAvgSamplesPerSec=12.002863773233729, CurrSamplesPerSec=11.92553654938506, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:32:27,826] [INFO] [timer.py:197:stop] 0/3838, RunningAvgSamplesPerSec=12.003726869000566, CurrSamplesPerSec=16.574358285355956, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:32:34,375] [INFO] [timer.py:197:stop] 0/3839, RunningAvgSamplesPerSec=12.003684569414496, CurrSamplesPerSec=11.843588044938803, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:32:40,818] [INFO] [logging.py:68:log_dist] [Rank 0] 
step=3840, skipped=6, lr=[2.5933333333333336e-06], mom=[[0.9, 0.999]] [2022-12-20 01:32:40,818] [INFO] [timer.py:197:stop] 0/3840, RunningAvgSamplesPerSec=12.003672274442673, CurrSamplesPerSec=11.9566811960055, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:32:47,317] [INFO] [timer.py:197:stop] 0/3841, RunningAvgSamplesPerSec=12.003650686338748, CurrSamplesPerSec=11.921363677160773, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:32:53,843] [INFO] [timer.py:197:stop] 0/3842, RunningAvgSamplesPerSec=12.003625957452744, CurrSamplesPerSec=11.909436880047394, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:33:00,370] [INFO] [timer.py:197:stop] 0/3843, RunningAvgSamplesPerSec=12.00359097708084, CurrSamplesPerSec=11.87075323954908, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:33:07,107] [INFO] [timer.py:197:stop] 0/3844, RunningAvgSamplesPerSec=12.003534609034856, CurrSamplesPerSec=11.790861953656437, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:33:14,083] [INFO] [timer.py:197:stop] 0/3845, RunningAvgSamplesPerSec=12.003507825852564, CurrSamplesPerSec=11.901481693935297, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:33:20,933] [INFO] [timer.py:197:stop] 0/3846, RunningAvgSamplesPerSec=12.0034749569564, CurrSamplesPerSec=11.878475527355228, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:33:27,686] [INFO] [timer.py:197:stop] 0/3847, RunningAvgSamplesPerSec=12.00344521736874, CurrSamplesPerSec=11.890205005416327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:33:34,300] [INFO] [timer.py:197:stop] 0/3848, RunningAvgSamplesPerSec=12.003441037507452, CurrSamplesPerSec=11.987390966103877, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:33:41,054] [INFO] [timer.py:197:stop] 0/3849, RunningAvgSamplesPerSec=12.00343262052405, CurrSamplesPerSec=11.971147992499844, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:33:47,548] [INFO] [logging.py:68:log_dist] 
[Rank 0] step=3850, skipped=6, lr=[2.5711111111111112e-06], mom=[[0.9, 0.999]] [2022-12-20 01:33:47,549] [INFO] [timer.py:197:stop] 0/3850, RunningAvgSamplesPerSec=12.003414344353953, CurrSamplesPerSec=11.933515448038094, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.5711111111111112e-06, 'epoch': 101.32} [2022-12-20 01:33:54,068] [INFO] [timer.py:197:stop] 0/3851, RunningAvgSamplesPerSec=12.003400946099001, CurrSamplesPerSec=11.952065014207676, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:34:00,672] [INFO] [timer.py:197:stop] 0/3852, RunningAvgSamplesPerSec=12.003400274582022, CurrSamplesPerSec=12.000816162302565, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:34:07,293] [INFO] [timer.py:197:stop] 0/3853, RunningAvgSamplesPerSec=12.003378508268014, CurrSamplesPerSec=11.920159336026472, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:34:13,864] [INFO] [timer.py:197:stop] 0/3854, RunningAvgSamplesPerSec=12.00334901503054, CurrSamplesPerSec=11.890835463161064, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:34:20,293] [INFO] [timer.py:197:stop] 0/3855, RunningAvgSamplesPerSec=12.00335242480248, CurrSamplesPerSec=12.016501257917634, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:34:26,744] [INFO] [timer.py:197:stop] 0/3856, RunningAvgSamplesPerSec=12.003349332579349, CurrSamplesPerSec=11.991446814169677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:34:33,203] [INFO] [timer.py:197:stop] 0/3857, RunningAvgSamplesPerSec=12.003348052540883, CurrSamplesPerSec=11.998416811512918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:34:39,829] [INFO] [timer.py:197:stop] 0/3858, RunningAvgSamplesPerSec=12.003349696707954, CurrSamplesPerSec=12.00969131027485, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:34:46,353] [INFO] [timer.py:197:stop] 0/3859, RunningAvgSamplesPerSec=12.003348364079523, CurrSamplesPerSec=11.998211948309912, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:34:52,896] [INFO] [logging.py:68:log_dist] [Rank 0] step=3860, skipped=6, lr=[2.5488888888888893e-06], mom=[[0.9, 0.999]] [2022-12-20 01:34:52,897] [INFO] [timer.py:197:stop] 0/3860, RunningAvgSamplesPerSec=12.003324759793317, CurrSamplesPerSec=11.9129685313281, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:34:59,429] [INFO] [timer.py:197:stop] 0/3861, RunningAvgSamplesPerSec=12.003307922461358, CurrSamplesPerSec=11.938699229926632, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:35:05,965] [INFO] [timer.py:197:stop] 0/3862, RunningAvgSamplesPerSec=12.0032751685726, CurrSamplesPerSec=11.878195372275156, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:35:12,417] [INFO] [timer.py:197:stop] 0/3863, RunningAvgSamplesPerSec=12.003265194246737, CurrSamplesPerSec=11.964887426350334, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:35:18,956] [INFO] [timer.py:197:stop] 0/3864, RunningAvgSamplesPerSec=12.003229614298943, CurrSamplesPerSec=11.867410259479986, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:35:25,507] [INFO] [timer.py:197:stop] 0/3865, RunningAvgSamplesPerSec=12.003228986216614, CurrSamplesPerSec=12.000803822475405, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:35:32,114] [INFO] [timer.py:197:stop] 0/3866, RunningAvgSamplesPerSec=12.003209054112025, CurrSamplesPerSec=11.92670223411299, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:35:38,569] [INFO] [timer.py:197:stop] 0/3867, RunningAvgSamplesPerSec=12.00318523782033, CurrSamplesPerSec=11.911859445489208, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:35:44,993] [INFO] [timer.py:197:stop] 0/3868, RunningAvgSamplesPerSec=12.003190857236216, CurrSamplesPerSec=12.02494928036556, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:35:51,497] [INFO] [timer.py:197:stop] 0/3869, RunningAvgSamplesPerSec=12.003151849185492, 
CurrSamplesPerSec=11.854218378500596, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:35:58,077] [INFO] [logging.py:68:log_dist] [Rank 0] step=3870, skipped=6, lr=[2.526666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 01:35:58,078] [INFO] [timer.py:197:stop] 0/3870, RunningAvgSamplesPerSec=12.00308281224248, CurrSamplesPerSec=11.741926938642713, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:36:04,637] [INFO] [timer.py:197:stop] 0/3871, RunningAvgSamplesPerSec=12.003087592488265, CurrSamplesPerSec=12.021606117168218, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:36:11,302] [INFO] [timer.py:197:stop] 0/3872, RunningAvgSamplesPerSec=12.003069863228077, CurrSamplesPerSec=11.934865228530484, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:36:17,789] [INFO] [timer.py:197:stop] 0/3873, RunningAvgSamplesPerSec=12.003066554216584, CurrSamplesPerSec=11.990274331078858, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:36:24,329] [INFO] [timer.py:197:stop] 0/3874, RunningAvgSamplesPerSec=12.003053020355821, CurrSamplesPerSec=11.950891174306236, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:36:30,718] [INFO] [timer.py:197:stop] 0/3875, RunningAvgSamplesPerSec=12.003059108744393, CurrSamplesPerSec=12.026679752689931, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.5155555555555557e-06, 'epoch': 101.97} [2022-12-20 01:36:35,303] [INFO] [timer.py:197:stop] 0/3876, RunningAvgSamplesPerSec=12.003907902503306, CurrSamplesPerSec=16.531538736396076, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:36:41,791] [INFO] [timer.py:197:stop] 0/3877, RunningAvgSamplesPerSec=12.003884532783548, CurrSamplesPerSec=11.91402811791963, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:36:48,308] [INFO] [timer.py:197:stop] 0/3878, RunningAvgSamplesPerSec=12.003862687662407, CurrSamplesPerSec=11.919805755919596, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
01:36:54,757] [INFO] [timer.py:197:stop] 0/3879, RunningAvgSamplesPerSec=12.003841101998953, CurrSamplesPerSec=11.92075432953992, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:37:01,372] [INFO] [logging.py:68:log_dist] [Rank 0] step=3880, skipped=6, lr=[2.504444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 01:37:01,373] [INFO] [timer.py:197:stop] 0/3880, RunningAvgSamplesPerSec=12.003823485597257, CurrSamplesPerSec=11.935911200012912, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:37:07,938] [INFO] [timer.py:197:stop] 0/3881, RunningAvgSamplesPerSec=12.003828620139672, CurrSamplesPerSec=12.023773468344924, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:37:14,532] [INFO] [timer.py:197:stop] 0/3882, RunningAvgSamplesPerSec=12.003814145183437, CurrSamplesPerSec=11.947927270590817, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:37:21,068] [INFO] [timer.py:197:stop] 0/3883, RunningAvgSamplesPerSec=12.003820527316684, CurrSamplesPerSec=12.028634406043635, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:37:27,450] [INFO] [timer.py:197:stop] 0/3884, RunningAvgSamplesPerSec=12.003819963517241, CurrSamplesPerSec=12.001632256770247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:37:33,954] [INFO] [timer.py:197:stop] 0/3885, RunningAvgSamplesPerSec=12.003804170061779, CurrSamplesPerSec=11.942805610241694, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:37:40,366] [INFO] [timer.py:197:stop] 0/3886, RunningAvgSamplesPerSec=12.003802485237413, CurrSamplesPerSec=11.997263876741625, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:37:46,891] [INFO] [timer.py:197:stop] 0/3887, RunningAvgSamplesPerSec=12.003780396841687, CurrSamplesPerSec=11.918598025201753, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:37:53,496] [INFO] [timer.py:197:stop] 0/3888, RunningAvgSamplesPerSec=12.003783153175391, CurrSamplesPerSec=12.01450107332971, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 01:38:00,110] [INFO] [timer.py:197:stop] 0/3889, RunningAvgSamplesPerSec=12.003765730046139, CurrSamplesPerSec=11.936439296860499, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:38:06,742] [INFO] [logging.py:68:log_dist] [Rank 0] step=3890, skipped=6, lr=[2.4822222222222225e-06], mom=[[0.9, 0.999]] [2022-12-20 01:38:06,743] [INFO] [timer.py:197:stop] 0/3890, RunningAvgSamplesPerSec=12.003741112795838, CurrSamplesPerSec=11.908810788134925, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:38:13,341] [INFO] [timer.py:197:stop] 0/3891, RunningAvgSamplesPerSec=12.003721998477255, CurrSamplesPerSec=11.929862915804025, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:38:19,916] [INFO] [timer.py:197:stop] 0/3892, RunningAvgSamplesPerSec=12.003663910660837, CurrSamplesPerSec=11.781934316140491, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:38:26,405] [INFO] [timer.py:197:stop] 0/3893, RunningAvgSamplesPerSec=12.003666484891372, CurrSamplesPerSec=12.013688604520928, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:38:32,864] [INFO] [timer.py:197:stop] 0/3894, RunningAvgSamplesPerSec=12.003668920021886, CurrSamplesPerSec=12.013151499843456, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:38:39,395] [INFO] [timer.py:197:stop] 0/3895, RunningAvgSamplesPerSec=12.003636630506493, CurrSamplesPerSec=11.879268232832466, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:38:46,039] [INFO] [timer.py:197:stop] 0/3896, RunningAvgSamplesPerSec=12.003615555645585, CurrSamplesPerSec=11.922128229180903, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:38:52,428] [INFO] [timer.py:197:stop] 0/3897, RunningAvgSamplesPerSec=12.003620787761163, CurrSamplesPerSec=12.024029294263446, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:38:58,906] [INFO] [timer.py:197:stop] 0/3898, RunningAvgSamplesPerSec=12.003623802209246, CurrSamplesPerSec=12.015376576351079, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:39:05,372] [INFO] [timer.py:197:stop] 0/3899, RunningAvgSamplesPerSec=12.00359161412903, CurrSamplesPerSec=11.879483774498587, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:39:11,872] [INFO] [logging.py:68:log_dist] [Rank 0] step=3900, skipped=6, lr=[2.46e-06], mom=[[0.9, 0.999]] [2022-12-20 01:39:11,873] [INFO] [timer.py:197:stop] 0/3900, RunningAvgSamplesPerSec=12.00357076400614, CurrSamplesPerSec=11.92286428255979, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.46e-06, 'epoch': 102.63} [2022-12-20 01:39:18,498] [INFO] [timer.py:197:stop] 0/3901, RunningAvgSamplesPerSec=12.0035513296688, CurrSamplesPerSec=11.928271499591386, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:39:25,099] [INFO] [timer.py:197:stop] 0/3902, RunningAvgSamplesPerSec=12.003534586672059, CurrSamplesPerSec=11.938606840976657, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:39:31,792] [INFO] [timer.py:197:stop] 0/3903, RunningAvgSamplesPerSec=12.003535590334558, CurrSamplesPerSec=12.007451151247956, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:39:38,210] [INFO] [timer.py:197:stop] 0/3904, RunningAvgSamplesPerSec=12.003538435958022, CurrSamplesPerSec=12.014649491150124, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:39:44,718] [INFO] [timer.py:197:stop] 0/3905, RunningAvgSamplesPerSec=12.003543765194475, CurrSamplesPerSec=12.024374541866441, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:39:51,204] [INFO] [timer.py:197:stop] 0/3906, RunningAvgSamplesPerSec=12.003536619354868, CurrSamplesPerSec=11.97571107657371, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:39:57,642] [INFO] [timer.py:197:stop] 0/3907, RunningAvgSamplesPerSec=12.003520970581238, CurrSamplesPerSec=11.942737599056239, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:40:04,202] [INFO] [timer.py:197:stop] 0/3908, 
RunningAvgSamplesPerSec=12.00349622193116, CurrSamplesPerSec=11.907624828235017, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:40:10,830] [INFO] [timer.py:197:stop] 0/3909, RunningAvgSamplesPerSec=12.003500969304362, CurrSamplesPerSec=12.02207290665986, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:40:17,461] [INFO] [logging.py:68:log_dist] [Rank 0] step=3910, skipped=6, lr=[2.437777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 01:40:17,462] [INFO] [timer.py:197:stop] 0/3910, RunningAvgSamplesPerSec=12.003479449834819, CurrSamplesPerSec=11.91998783646871, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:40:24,087] [INFO] [timer.py:197:stop] 0/3911, RunningAvgSamplesPerSec=12.00345509457204, CurrSamplesPerSec=11.90902370483293, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:40:30,500] [INFO] [timer.py:197:stop] 0/3912, RunningAvgSamplesPerSec=12.003440999517027, CurrSamplesPerSec=11.948595244010033, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:40:37,045] [INFO] [timer.py:197:stop] 0/3913, RunningAvgSamplesPerSec=12.003394890796717, CurrSamplesPerSec=11.825778192242106, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:40:41,608] [INFO] [timer.py:197:stop] 0/3914, RunningAvgSamplesPerSec=12.004244791304798, CurrSamplesPerSec=16.601507717506376, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:40:48,120] [INFO] [timer.py:197:stop] 0/3915, RunningAvgSamplesPerSec=12.004224307518026, CurrSamplesPerSec=11.924623235675394, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:40:54,555] [INFO] [timer.py:197:stop] 0/3916, RunningAvgSamplesPerSec=12.004229312042261, CurrSamplesPerSec=12.023844021353211, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:41:01,009] [INFO] [timer.py:197:stop] 0/3917, RunningAvgSamplesPerSec=12.004226350977047, CurrSamplesPerSec=11.992647923108866, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:41:07,712] [INFO] 
[timer.py:197:stop] 0/3918, RunningAvgSamplesPerSec=12.004198214014663, CurrSamplesPerSec=11.895043913679267, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:41:14,210] [INFO] [timer.py:197:stop] 0/3919, RunningAvgSamplesPerSec=12.004174353178424, CurrSamplesPerSec=11.911457201805616, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:41:20,673] [INFO] [logging.py:68:log_dist] [Rank 0] step=3920, skipped=6, lr=[2.415555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 01:41:20,673] [INFO] [timer.py:197:stop] 0/3920, RunningAvgSamplesPerSec=12.004151672781363, CurrSamplesPerSec=11.91596536445564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:41:27,039] [INFO] [timer.py:197:stop] 0/3921, RunningAvgSamplesPerSec=12.004154434202462, CurrSamplesPerSec=12.014983444653446, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:41:33,506] [INFO] [timer.py:197:stop] 0/3922, RunningAvgSamplesPerSec=12.004129421026123, CurrSamplesPerSec=11.906896992996307, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:41:39,942] [INFO] [timer.py:197:stop] 0/3923, RunningAvgSamplesPerSec=12.004118402127654, CurrSamplesPerSec=11.961079226291686, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:41:46,416] [INFO] [timer.py:197:stop] 0/3924, RunningAvgSamplesPerSec=12.004089920698847, CurrSamplesPerSec=11.893443856270322, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:41:53,037] [INFO] [timer.py:197:stop] 0/3925, RunningAvgSamplesPerSec=12.0040204226245, CurrSamplesPerSec=11.73750226061369, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.4044444444444446e-06, 'epoch': 103.29} [2022-12-20 01:41:59,544] [INFO] [timer.py:197:stop] 0/3926, RunningAvgSamplesPerSec=12.003999620136524, CurrSamplesPerSec=11.922942659064647, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:42:06,022] [INFO] [timer.py:197:stop] 0/3927, RunningAvgSamplesPerSec=12.004004540638645, 
CurrSamplesPerSec=12.023343705307534, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:42:12,471] [INFO] [timer.py:197:stop] 0/3928, RunningAvgSamplesPerSec=12.003987049385273, CurrSamplesPerSec=11.935724386990978, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:42:19,024] [INFO] [timer.py:197:stop] 0/3929, RunningAvgSamplesPerSec=12.003948374523251, CurrSamplesPerSec=11.854007940429732, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:42:25,570] [INFO] [logging.py:68:log_dist] [Rank 0] step=3930, skipped=6, lr=[2.3933333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 01:42:25,571] [INFO] [timer.py:197:stop] 0/3930, RunningAvgSamplesPerSec=12.0039264025231, CurrSamplesPerSec=11.918258295814722, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:42:32,023] [INFO] [timer.py:197:stop] 0/3931, RunningAvgSamplesPerSec=12.003901340451304, CurrSamplesPerSec=11.906258493842563, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:42:38,547] [INFO] [timer.py:197:stop] 0/3932, RunningAvgSamplesPerSec=12.003862228627302, CurrSamplesPerSec=11.852134742738668, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:42:45,089] [INFO] [timer.py:197:stop] 0/3933, RunningAvgSamplesPerSec=12.00380175058334, CurrSamplesPerSec=11.77073892279253, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:42:51,586] [INFO] [timer.py:197:stop] 0/3934, RunningAvgSamplesPerSec=12.003803300807231, CurrSamplesPerSec=12.009900326967063, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:42:58,003] [INFO] [timer.py:197:stop] 0/3935, RunningAvgSamplesPerSec=12.003806807389322, CurrSamplesPerSec=12.017610547525498, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:43:04,498] [INFO] [timer.py:197:stop] 0/3936, RunningAvgSamplesPerSec=12.003779982963113, CurrSamplesPerSec=11.899198904921823, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:43:11,030] [INFO] [timer.py:197:stop] 0/3937, 
RunningAvgSamplesPerSec=12.003768277785223, CurrSamplesPerSec=11.95789612482145, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:43:17,518] [INFO] [timer.py:197:stop] 0/3938, RunningAvgSamplesPerSec=12.003769261874199, CurrSamplesPerSec=12.00764290193552, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:43:24,022] [INFO] [timer.py:197:stop] 0/3939, RunningAvgSamplesPerSec=12.003766144566413, CurrSamplesPerSec=11.991508953065564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:43:30,474] [INFO] [logging.py:68:log_dist] [Rank 0] step=3940, skipped=6, lr=[2.371111111111111e-06], mom=[[0.9, 0.999]] [2022-12-20 01:43:30,475] [INFO] [timer.py:197:stop] 0/3940, RunningAvgSamplesPerSec=12.003744046849883, CurrSamplesPerSec=11.917371492298239, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:43:36,967] [INFO] [timer.py:197:stop] 0/3941, RunningAvgSamplesPerSec=12.00372314981501, CurrSamplesPerSec=11.921991089710815, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:43:43,402] [INFO] [timer.py:197:stop] 0/3942, RunningAvgSamplesPerSec=12.003729545616263, CurrSamplesPerSec=12.028975605869526, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:43:49,861] [INFO] [timer.py:197:stop] 0/3943, RunningAvgSamplesPerSec=12.003718932718007, CurrSamplesPerSec=11.962049306098935, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:43:56,305] [INFO] [timer.py:197:stop] 0/3944, RunningAvgSamplesPerSec=12.003719494395895, CurrSamplesPerSec=12.005933475340056, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:44:02,866] [INFO] [timer.py:197:stop] 0/3945, RunningAvgSamplesPerSec=12.003688683629912, CurrSamplesPerSec=11.883449559427506, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:44:09,379] [INFO] [timer.py:197:stop] 0/3946, RunningAvgSamplesPerSec=12.003660612912167, CurrSamplesPerSec=11.89398928202449, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:44:15,874] [INFO] 
[timer.py:197:stop] 0/3947, RunningAvgSamplesPerSec=12.003652820517514, CurrSamplesPerSec=11.972998121772145, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:44:22,426] [INFO] [timer.py:197:stop] 0/3948, RunningAvgSamplesPerSec=12.003645818462243, CurrSamplesPerSec=11.976086147546326, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:44:28,916] [INFO] [timer.py:197:stop] 0/3949, RunningAvgSamplesPerSec=12.003614202703586, CurrSamplesPerSec=11.880142013560604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:44:35,336] [INFO] [logging.py:68:log_dist] [Rank 0] step=3950, skipped=6, lr=[2.348888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 01:44:35,337] [INFO] [timer.py:197:stop] 0/3950, RunningAvgSamplesPerSec=12.003598776984113, CurrSamplesPerSec=11.943020806986189, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.348888888888889e-06, 'epoch': 103.95} [2022-12-20 01:44:41,755] [INFO] [timer.py:197:stop] 0/3951, RunningAvgSamplesPerSec=12.003602578198395, CurrSamplesPerSec=12.01862856279636, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:44:46,389] [INFO] [timer.py:197:stop] 0/3952, RunningAvgSamplesPerSec=12.00443411300075, CurrSamplesPerSec=16.52506621157737, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:44:52,898] [INFO] [timer.py:197:stop] 0/3953, RunningAvgSamplesPerSec=12.004415876368698, CurrSamplesPerSec=11.932810966920785, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:44:59,415] [INFO] [timer.py:197:stop] 0/3954, RunningAvgSamplesPerSec=12.004379979267942, CurrSamplesPerSec=11.8642070693644, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:45:05,885] [INFO] [timer.py:197:stop] 0/3955, RunningAvgSamplesPerSec=12.004376297816286, CurrSamplesPerSec=11.989844817229937, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:45:12,403] [INFO] [timer.py:197:stop] 0/3956, RunningAvgSamplesPerSec=12.004336229519224, 
CurrSamplesPerSec=11.84800941112622, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:45:18,868] [INFO] [timer.py:197:stop] 0/3957, RunningAvgSamplesPerSec=12.004325993923633, CurrSamplesPerSec=11.963990471075615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:45:25,375] [INFO] [timer.py:197:stop] 0/3958, RunningAvgSamplesPerSec=12.004287958052164, CurrSamplesPerSec=11.855718358765216, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:45:31,930] [INFO] [timer.py:197:stop] 0/3959, RunningAvgSamplesPerSec=12.00423953515275, CurrSamplesPerSec=11.815688166544689, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:45:38,397] [INFO] [logging.py:68:log_dist] [Rank 0] step=3960, skipped=6, lr=[2.3266666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 01:45:38,398] [INFO] [timer.py:197:stop] 0/3960, RunningAvgSamplesPerSec=12.004218964565672, CurrSamplesPerSec=11.923369511360551, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:45:44,967] [INFO] [timer.py:197:stop] 0/3961, RunningAvgSamplesPerSec=12.004177225485735, CurrSamplesPerSec=11.84121719398191, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:45:51,499] [INFO] [timer.py:197:stop] 0/3962, RunningAvgSamplesPerSec=12.00415026469125, CurrSamplesPerSec=11.898353434861107, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:45:58,024] [INFO] [timer.py:197:stop] 0/3963, RunningAvgSamplesPerSec=12.004115933204343, CurrSamplesPerSec=11.86968611524341, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:46:04,529] [INFO] [timer.py:197:stop] 0/3964, RunningAvgSamplesPerSec=12.004076685826373, CurrSamplesPerSec=11.85060584841978, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:46:11,003] [INFO] [timer.py:197:stop] 0/3965, RunningAvgSamplesPerSec=12.004065596501217, CurrSamplesPerSec=11.960289953275483, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:46:17,551] [INFO] [timer.py:197:stop] 0/3966, 
RunningAvgSamplesPerSec=12.004020701253571, CurrSamplesPerSec=11.828700038936136, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:46:24,018] [INFO] [timer.py:197:stop] 0/3967, RunningAvgSamplesPerSec=12.00399714075904, CurrSamplesPerSec=11.911324536062164, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:46:30,516] [INFO] [timer.py:197:stop] 0/3968, RunningAvgSamplesPerSec=12.003969267679699, CurrSamplesPerSec=11.894460970716816, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:46:36,981] [INFO] [timer.py:197:stop] 0/3969, RunningAvgSamplesPerSec=12.003953811083113, CurrSamplesPerSec=11.942964483138374, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:46:43,496] [INFO] [logging.py:68:log_dist] [Rank 0] step=3970, skipped=6, lr=[2.3044444444444447e-06], mom=[[0.9, 0.999]] [2022-12-20 01:46:43,497] [INFO] [timer.py:197:stop] 0/3970, RunningAvgSamplesPerSec=12.003931810403683, CurrSamplesPerSec=11.91728525293561, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:46:50,045] [INFO] [timer.py:197:stop] 0/3971, RunningAvgSamplesPerSec=12.003888410132985, CurrSamplesPerSec=11.834112419710928, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:46:56,483] [INFO] [timer.py:197:stop] 0/3972, RunningAvgSamplesPerSec=12.003892158487913, CurrSamplesPerSec=12.018787845058528, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:47:02,938] [INFO] [timer.py:197:stop] 0/3973, RunningAvgSamplesPerSec=12.003863004040483, CurrSamplesPerSec=11.889225479408225, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:47:09,410] [INFO] [timer.py:197:stop] 0/3974, RunningAvgSamplesPerSec=12.003844249827708, CurrSamplesPerSec=11.929830574339249, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:47:15,834] [INFO] [timer.py:197:stop] 0/3975, RunningAvgSamplesPerSec=12.003838451577352, CurrSamplesPerSec=11.980851914404639, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 
2.2933333333333335e-06, 'epoch': 104.61} [2022-12-20 01:47:22,349] [INFO] [timer.py:197:stop] 0/3976, RunningAvgSamplesPerSec=12.003810045427109, CurrSamplesPerSec=11.892003856505154, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:47:28,820] [INFO] [timer.py:197:stop] 0/3977, RunningAvgSamplesPerSec=12.003799125207859, CurrSamplesPerSec=11.960558539557397, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:47:35,314] [INFO] [timer.py:197:stop] 0/3978, RunningAvgSamplesPerSec=12.003775171041523, CurrSamplesPerSec=11.909306901121843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:47:41,793] [INFO] [timer.py:197:stop] 0/3979, RunningAvgSamplesPerSec=12.003747956255943, CurrSamplesPerSec=11.89650890230157, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:47:48,429] [INFO] [logging.py:68:log_dist] [Rank 0] step=3980, skipped=6, lr=[2.2822222222222223e-06], mom=[[0.9, 0.999]] [2022-12-20 01:47:48,430] [INFO] [timer.py:197:stop] 0/3980, RunningAvgSamplesPerSec=12.003695567542586, CurrSamplesPerSec=11.798901194441395, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:47:55,285] [INFO] [timer.py:197:stop] 0/3981, RunningAvgSamplesPerSec=12.003661128610476, CurrSamplesPerSec=11.86820935957052, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:48:02,158] [INFO] [timer.py:197:stop] 0/3982, RunningAvgSamplesPerSec=12.003648072533814, CurrSamplesPerSec=11.95192186363376, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:48:08,748] [INFO] [timer.py:197:stop] 0/3983, RunningAvgSamplesPerSec=12.003643383049269, CurrSamplesPerSec=11.985008217173954, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:48:15,308] [INFO] [timer.py:197:stop] 0/3984, RunningAvgSamplesPerSec=12.003614384409877, CurrSamplesPerSec=11.8892707657619, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:48:22,098] [INFO] [timer.py:197:stop] 0/3985, RunningAvgSamplesPerSec=12.003581985299933, 
CurrSamplesPerSec=11.875940946675929, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:48:28,568] [INFO] [timer.py:197:stop] 0/3986, RunningAvgSamplesPerSec=12.003564801125734, CurrSamplesPerSec=11.935508391723934, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:48:35,005] [INFO] [timer.py:197:stop] 0/3987, RunningAvgSamplesPerSec=12.003557436537637, CurrSamplesPerSec=11.974288478178849, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:48:41,541] [INFO] [timer.py:197:stop] 0/3988, RunningAvgSamplesPerSec=12.003557975015356, CurrSamplesPerSec=12.005704192492033, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:48:48,149] [INFO] [timer.py:197:stop] 0/3989, RunningAvgSamplesPerSec=12.003559978240085, CurrSamplesPerSec=12.011550148466688, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:48:52,819] [INFO] [logging.py:68:log_dist] [Rank 0] step=3990, skipped=6, lr=[2.2600000000000004e-06], mom=[[0.9, 0.999]] [2022-12-20 01:48:52,820] [INFO] [timer.py:197:stop] 0/3990, RunningAvgSamplesPerSec=12.004382264150882, CurrSamplesPerSec=16.51502247106054, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:48:59,374] [INFO] [timer.py:197:stop] 0/3991, RunningAvgSamplesPerSec=12.00438604314129, CurrSamplesPerSec=12.01947560545596, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:49:06,087] [INFO] [timer.py:197:stop] 0/3992, RunningAvgSamplesPerSec=12.004361515412473, CurrSamplesPerSec=11.907311605212088, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:49:12,673] [INFO] [timer.py:197:stop] 0/3993, RunningAvgSamplesPerSec=12.004342665910562, CurrSamplesPerSec=11.929601538164261, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:49:19,150] [INFO] [timer.py:197:stop] 0/3994, RunningAvgSamplesPerSec=12.004319053088826, CurrSamplesPerSec=11.910814514010664, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:49:25,636] [INFO] [timer.py:197:stop] 0/3995, 
RunningAvgSamplesPerSec=12.004313503708186, CurrSamplesPerSec=11.982201193151765, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:49:32,196] [INFO] [timer.py:197:stop] 0/3996, RunningAvgSamplesPerSec=12.004284437574418, CurrSamplesPerSec=11.889335009785704, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:49:38,674] [INFO] [timer.py:197:stop] 0/3997, RunningAvgSamplesPerSec=12.004277250495793, CurrSamplesPerSec=11.975640553031987, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:49:45,168] [INFO] [timer.py:197:stop] 0/3998, RunningAvgSamplesPerSec=12.004254174457863, CurrSamplesPerSec=11.9127681616429, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:49:51,815] [INFO] [timer.py:197:stop] 0/3999, RunningAvgSamplesPerSec=12.00423177070538, CurrSamplesPerSec=11.91536926218584, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:49:58,365] [INFO] [logging.py:68:log_dist] [Rank 0] step=4000, skipped=6, lr=[2.237777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 01:49:58,366] [INFO] [timer.py:197:stop] 0/4000, RunningAvgSamplesPerSec=12.00423268302079, CurrSamplesPerSec=12.007880316033937, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.237777777777778e-06, 'epoch': 105.26} {'eval_loss': 0.44580078125, 'eval_wer': 18.032069970845484, 'eval_runtime': 168.4403, 'eval_samples_per_second': 7.166, 'eval_steps_per_second': 0.226, 'epoch': 105.26} [2022-12-20 01:52:48,626] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step4000 is begin to save! [2022-12-20 01:52:48,634] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: ./checkpoint-4000/global_step4000/mp_rank_00_model_states.pt [2022-12-20 01:52:48,634] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving ./checkpoint-4000/global_step4000/mp_rank_00_model_states.pt... 
[2022-12-20 01:52:50,506] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved ./checkpoint-4000/global_step4000/mp_rank_00_model_states.pt. [2022-12-20 01:52:50,507] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving ./checkpoint-4000/global_step4000/zero_pp_rank_0_mp_rank_00_optim_states.pt... [2022-12-20 01:52:57,817] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved ./checkpoint-4000/global_step4000/zero_pp_rank_0_mp_rank_00_optim_states.pt. [2022-12-20 01:52:57,818] [INFO] [engine.py:3269:_save_zero_checkpoint] zero checkpoint saved ./checkpoint-4000/global_step4000/zero_pp_rank_0_mp_rank_00_optim_states.pt [2022-12-20 01:52:57,818] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step4000 is ready now! [2022-12-20 01:54:15,312] [INFO] [timer.py:197:stop] 0/4001, RunningAvgSamplesPerSec=12.004123110771685, CurrSamplesPerSec=11.581480724537343, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:54:21,788] [INFO] [timer.py:197:stop] 0/4002, RunningAvgSamplesPerSec=12.004096423422798, CurrSamplesPerSec=11.898314407952874, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:54:28,430] [INFO] [timer.py:197:stop] 0/4003, RunningAvgSamplesPerSec=12.004084639772081, CurrSamplesPerSec=11.957134435183542, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:54:35,043] [INFO] [timer.py:197:stop] 0/4004, RunningAvgSamplesPerSec=12.004083561678474, CurrSamplesPerSec=11.999771658950918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:54:41,503] [INFO] [timer.py:197:stop] 0/4005, RunningAvgSamplesPerSec=12.004081138553007, CurrSamplesPerSec=11.994391619940965, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:54:47,985] [INFO] [timer.py:197:stop] 0/4006, RunningAvgSamplesPerSec=12.004054381548189, CurrSamplesPerSec=11.897893566478789, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:54:53,983] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. 
Attempted loss scale: 65536.0, reducing to 65536.0 [2022-12-20 01:54:53,984] [INFO] [timer.py:197:stop] 0/4007, RunningAvgSamplesPerSec=12.00424334166192, CurrSamplesPerSec=12.811747221342047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:55:00,197] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 65536.0, reducing to 32768.0 [2022-12-20 01:55:00,198] [INFO] [timer.py:197:stop] 0/4008, RunningAvgSamplesPerSec=12.004428118618659, CurrSamplesPerSec=12.793090279884327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:55:06,832] [INFO] [timer.py:197:stop] 0/4009, RunningAvgSamplesPerSec=12.004428251356638, CurrSamplesPerSec=12.004960023262575, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:55:13,382] [INFO] [logging.py:68:log_dist] [Rank 0] step=4010, skipped=8, lr=[2.2200000000000003e-06], mom=[[0.9, 0.999]] [2022-12-20 01:55:13,382] [INFO] [timer.py:197:stop] 0/4010, RunningAvgSamplesPerSec=12.004429985166102, CurrSamplesPerSec=12.011381383699298, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:55:19,808] [INFO] [timer.py:197:stop] 0/4011, RunningAvgSamplesPerSec=12.004428291400423, CurrSamplesPerSec=11.997643516373794, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:55:26,261] [INFO] [timer.py:197:stop] 0/4012, RunningAvgSamplesPerSec=12.004416233773906, CurrSamplesPerSec=11.956271126711004, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:55:32,770] [INFO] [timer.py:197:stop] 0/4013, RunningAvgSamplesPerSec=12.004393593776078, CurrSamplesPerSec=11.914288813569915, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:55:39,252] [INFO] [timer.py:197:stop] 0/4014, RunningAvgSamplesPerSec=12.00439014878608, CurrSamplesPerSec=11.990588184845077, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:55:45,816] [INFO] [timer.py:197:stop] 0/4015, RunningAvgSamplesPerSec=12.004394572888401, CurrSamplesPerSec=12.022170360943068, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:55:52,433] [INFO] [timer.py:197:stop] 0/4016, RunningAvgSamplesPerSec=12.004370760722816, CurrSamplesPerSec=11.909567390197111, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:55:58,930] [INFO] [timer.py:197:stop] 0/4017, RunningAvgSamplesPerSec=12.004371065655986, CurrSamplesPerSec=12.005595192242154, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:56:05,370] [INFO] [timer.py:197:stop] 0/4018, RunningAvgSamplesPerSec=12.004359632179282, CurrSamplesPerSec=11.958629142974015, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:56:11,818] [INFO] [timer.py:197:stop] 0/4019, RunningAvgSamplesPerSec=12.004361501447095, CurrSamplesPerSec=12.011873179607035, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:56:18,326] [INFO] [logging.py:68:log_dist] [Rank 0] step=4020, skipped=8, lr=[2.197777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 01:56:18,327] [INFO] [timer.py:197:stop] 0/4020, RunningAvgSamplesPerSec=12.004334827439283, CurrSamplesPerSec=11.898133516302812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:56:24,818] [INFO] [timer.py:197:stop] 0/4021, RunningAvgSamplesPerSec=12.004317225570166, CurrSamplesPerSec=11.934007254864678, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:56:31,222] [INFO] [timer.py:197:stop] 0/4022, RunningAvgSamplesPerSec=12.004322320941542, CurrSamplesPerSec=12.02483562106523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:56:37,811] [INFO] [timer.py:197:stop] 0/4023, RunningAvgSamplesPerSec=12.004305803025838, CurrSamplesPerSec=11.938269155600432, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:56:44,311] [INFO] [timer.py:197:stop] 0/4024, RunningAvgSamplesPerSec=12.004290280396663, CurrSamplesPerSec=11.942196725173922, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:56:50,781] [INFO] [timer.py:197:stop] 0/4025, RunningAvgSamplesPerSec=12.004296665397634, 
CurrSamplesPerSec=12.030032208352715, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.1866666666666668e-06, 'epoch': 105.92} [2022-12-20 01:56:57,253] [INFO] [timer.py:197:stop] 0/4026, RunningAvgSamplesPerSec=12.004275797074035, CurrSamplesPerSec=11.920905734166407, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:57:03,716] [INFO] [timer.py:197:stop] 0/4027, RunningAvgSamplesPerSec=12.004268563557458, CurrSamplesPerSec=11.975231319177214, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:57:08,383] [INFO] [timer.py:197:stop] 0/4028, RunningAvgSamplesPerSec=12.005091868871453, CurrSamplesPerSec=16.582812578860658, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:57:14,884] [INFO] [timer.py:197:stop] 0/4029, RunningAvgSamplesPerSec=12.005093410291462, CurrSamplesPerSec=12.011302377619474, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:57:21,315] [INFO] [logging.py:68:log_dist] [Rank 0] step=4030, skipped=8, lr=[2.1755555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 01:57:21,316] [INFO] [timer.py:197:stop] 0/4030, RunningAvgSamplesPerSec=12.005096673203859, CurrSamplesPerSec=12.018250822418384, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:57:27,801] [INFO] [timer.py:197:stop] 0/4031, RunningAvgSamplesPerSec=12.005082804162592, CurrSamplesPerSec=11.949477125850146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:57:34,245] [INFO] [timer.py:197:stop] 0/4032, RunningAvgSamplesPerSec=12.005072064155321, CurrSamplesPerSec=11.961956022356812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:57:40,712] [INFO] [timer.py:197:stop] 0/4033, RunningAvgSamplesPerSec=12.005074462445888, CurrSamplesPerSec=12.014747362878547, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:57:47,206] [INFO] [timer.py:197:stop] 0/4034, RunningAvgSamplesPerSec=12.005075700268412, CurrSamplesPerSec=12.010067438090777, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
01:57:53,673] [INFO] [timer.py:197:stop] 0/4035, RunningAvgSamplesPerSec=12.005078468156453, CurrSamplesPerSec=12.016248979604262, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:58:00,079] [INFO] [timer.py:197:stop] 0/4036, RunningAvgSamplesPerSec=12.005081548772411, CurrSamplesPerSec=12.017518547243473, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:58:06,576] [INFO] [timer.py:197:stop] 0/4037, RunningAvgSamplesPerSec=12.005062813139185, CurrSamplesPerSec=11.92995622986558, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:58:13,071] [INFO] [timer.py:197:stop] 0/4038, RunningAvgSamplesPerSec=12.005039401542271, CurrSamplesPerSec=11.911311322504524, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:58:19,496] [INFO] [timer.py:197:stop] 0/4039, RunningAvgSamplesPerSec=12.005014728498903, CurrSamplesPerSec=11.906253741003905, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:58:26,049] [INFO] [logging.py:68:log_dist] [Rank 0] step=4040, skipped=8, lr=[2.153333333333333e-06], mom=[[0.9, 0.999]] [2022-12-20 01:58:26,050] [INFO] [timer.py:197:stop] 0/4040, RunningAvgSamplesPerSec=12.004992321962934, CurrSamplesPerSec=11.915213767054338, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:58:32,579] [INFO] [timer.py:197:stop] 0/4041, RunningAvgSamplesPerSec=12.004972670056405, CurrSamplesPerSec=11.926139497853223, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:58:38,990] [INFO] [timer.py:197:stop] 0/4042, RunningAvgSamplesPerSec=12.004954695076465, CurrSamplesPerSec=11.932790279349998, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:58:45,366] [INFO] [timer.py:197:stop] 0/4043, RunningAvgSamplesPerSec=12.004951581639782, CurrSamplesPerSec=11.992386465886547, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:58:51,924] [INFO] [timer.py:197:stop] 0/4044, RunningAvgSamplesPerSec=12.00488212558153, CurrSamplesPerSec=11.7306238879994, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 01:58:58,379] [INFO] [timer.py:197:stop] 0/4045, RunningAvgSamplesPerSec=12.00486888783137, CurrSamplesPerSec=11.951599388287162, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:59:04,887] [INFO] [timer.py:197:stop] 0/4046, RunningAvgSamplesPerSec=12.004850851061196, CurrSamplesPerSec=11.932368586810073, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:59:11,384] [INFO] [timer.py:197:stop] 0/4047, RunningAvgSamplesPerSec=12.004832673018164, CurrSamplesPerSec=11.931768190498323, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:59:17,873] [INFO] [timer.py:197:stop] 0/4048, RunningAvgSamplesPerSec=12.004806619587603, CurrSamplesPerSec=11.900337815896387, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:59:24,377] [INFO] [timer.py:197:stop] 0/4049, RunningAvgSamplesPerSec=12.004778556500323, CurrSamplesPerSec=11.89229941528095, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:59:30,905] [INFO] [logging.py:68:log_dist] [Rank 0] step=4050, skipped=8, lr=[2.1311111111111112e-06], mom=[[0.9, 0.999]] [2022-12-20 01:59:30,905] [INFO] [timer.py:197:stop] 0/4050, RunningAvgSamplesPerSec=12.004748330898336, CurrSamplesPerSec=11.883659465857564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.1311111111111112e-06, 'epoch': 106.58} [2022-12-20 01:59:37,322] [INFO] [timer.py:197:stop] 0/4051, RunningAvgSamplesPerSec=12.004750673713325, CurrSamplesPerSec=12.01424188868297, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:59:44,009] [INFO] [timer.py:197:stop] 0/4052, RunningAvgSamplesPerSec=12.004748111062568, CurrSamplesPerSec=11.994380901140705, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:59:50,423] [INFO] [timer.py:197:stop] 0/4053, RunningAvgSamplesPerSec=12.004723249165462, CurrSamplesPerSec=11.904870298180716, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 01:59:56,857] [INFO] [timer.py:197:stop] 0/4054, 
RunningAvgSamplesPerSec=12.004710278453896, CurrSamplesPerSec=11.952394965393616, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:00:03,249] [INFO] [timer.py:197:stop] 0/4055, RunningAvgSamplesPerSec=12.004714787175473, CurrSamplesPerSec=12.023011979411654, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:00:09,897] [INFO] [timer.py:197:stop] 0/4056, RunningAvgSamplesPerSec=12.004691163613824, CurrSamplesPerSec=11.90970265879606, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:00:16,364] [INFO] [timer.py:197:stop] 0/4057, RunningAvgSamplesPerSec=12.004673678084755, CurrSamplesPerSec=11.934203564485603, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:00:22,862] [INFO] [timer.py:197:stop] 0/4058, RunningAvgSamplesPerSec=12.004649512609479, CurrSamplesPerSec=11.907452104563784, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:00:29,345] [INFO] [timer.py:197:stop] 0/4059, RunningAvgSamplesPerSec=12.004651425205687, CurrSamplesPerSec=12.01241393284605, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:00:35,785] [INFO] [logging.py:68:log_dist] [Rank 0] step=4060, skipped=8, lr=[2.108888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 02:00:35,786] [INFO] [timer.py:197:stop] 0/4060, RunningAvgSamplesPerSec=12.004635725352784, CurrSamplesPerSec=11.941277670722814, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:00:42,194] [INFO] [timer.py:197:stop] 0/4061, RunningAvgSamplesPerSec=12.004620856216306, CurrSamplesPerSec=11.944583739962134, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:00:48,633] [INFO] [timer.py:197:stop] 0/4062, RunningAvgSamplesPerSec=12.004600304720528, CurrSamplesPerSec=11.921757589223835, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:00:55,080] [INFO] [timer.py:197:stop] 0/4063, RunningAvgSamplesPerSec=12.004605349110683, CurrSamplesPerSec=12.025120581379895, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:01:01,498] [INFO] 
[timer.py:197:stop] 0/4064, RunningAvgSamplesPerSec=12.004613900266525, CurrSamplesPerSec=12.039440914492733, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:01:07,950] [INFO] [timer.py:197:stop] 0/4065, RunningAvgSamplesPerSec=12.004622248761946, CurrSamplesPerSec=12.038629928341493, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:01:12,616] [INFO] [timer.py:197:stop] 0/4066, RunningAvgSamplesPerSec=12.005431747897077, CurrSamplesPerSec=16.535892222399564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:01:19,054] [INFO] [timer.py:197:stop] 0/4067, RunningAvgSamplesPerSec=12.005428655433285, CurrSamplesPerSec=11.992874028505584, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:01:25,559] [INFO] [timer.py:197:stop] 0/4068, RunningAvgSamplesPerSec=12.005409456307794, CurrSamplesPerSec=11.927869206771717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:01:32,130] [INFO] [timer.py:197:stop] 0/4069, RunningAvgSamplesPerSec=12.005368401765665, CurrSamplesPerSec=11.840730393641255, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:01:38,560] [INFO] [logging.py:68:log_dist] [Rank 0] step=4070, skipped=8, lr=[2.086666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 02:01:38,561] [INFO] [timer.py:197:stop] 0/4070, RunningAvgSamplesPerSec=12.005372236558813, CurrSamplesPerSec=12.020988632441169, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:01:44,989] [INFO] [timer.py:197:stop] 0/4071, RunningAvgSamplesPerSec=12.005375468021272, CurrSamplesPerSec=12.018535470726329, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:01:51,540] [INFO] [timer.py:197:stop] 0/4072, RunningAvgSamplesPerSec=12.005326109375579, CurrSamplesPerSec=11.807791206706398, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:01:58,017] [INFO] [timer.py:197:stop] 0/4073, RunningAvgSamplesPerSec=12.005318813459724, CurrSamplesPerSec=11.975697719778562, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
02:02:04,428] [INFO] [timer.py:197:stop] 0/4074, RunningAvgSamplesPerSec=12.00531613516777, CurrSamplesPerSec=11.994422704570054, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:02:10,920] [INFO] [timer.py:197:stop] 0/4075, RunningAvgSamplesPerSec=12.005290832582757, CurrSamplesPerSec=11.903135640708296, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.0755555555555557e-06, 'epoch': 107.24} [2022-12-20 02:02:17,702] [INFO] [timer.py:197:stop] 0/4076, RunningAvgSamplesPerSec=12.005272888903322, CurrSamplesPerSec=11.932630616152727, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:02:24,153] [INFO] [timer.py:197:stop] 0/4077, RunningAvgSamplesPerSec=12.005270655512, CurrSamplesPerSec=11.996178711767175, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:02:30,563] [INFO] [timer.py:197:stop] 0/4078, RunningAvgSamplesPerSec=12.005254015499967, CurrSamplesPerSec=11.93782690160385, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:02:36,981] [INFO] [timer.py:197:stop] 0/4079, RunningAvgSamplesPerSec=12.00525628106804, CurrSamplesPerSec=12.014497846896077, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:02:43,420] [INFO] [logging.py:68:log_dist] [Rank 0] step=4080, skipped=8, lr=[2.064444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 02:02:43,420] [INFO] [timer.py:197:stop] 0/4080, RunningAvgSamplesPerSec=12.005234906827573, CurrSamplesPerSec=11.91872026901847, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:02:49,931] [INFO] [timer.py:197:stop] 0/4081, RunningAvgSamplesPerSec=12.005217656763193, CurrSamplesPerSec=11.935281790841895, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:02:56,359] [INFO] [timer.py:197:stop] 0/4082, RunningAvgSamplesPerSec=12.005220382119191, CurrSamplesPerSec=12.016347415297842, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:03:02,832] [INFO] [timer.py:197:stop] 0/4083, RunningAvgSamplesPerSec=12.005218269621182, 
CurrSamplesPerSec=11.99660546271307, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:03:09,219] [INFO] [timer.py:197:stop] 0/4084, RunningAvgSamplesPerSec=12.005203035225097, CurrSamplesPerSec=11.9433518531685, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:03:15,710] [INFO] [timer.py:197:stop] 0/4085, RunningAvgSamplesPerSec=12.005187736926668, CurrSamplesPerSec=11.943063315902354, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:03:22,169] [INFO] [timer.py:197:stop] 0/4086, RunningAvgSamplesPerSec=12.005161207462509, CurrSamplesPerSec=11.897810245507076, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:03:28,793] [INFO] [timer.py:197:stop] 0/4087, RunningAvgSamplesPerSec=12.005160584244793, CurrSamplesPerSec=12.002615902724761, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:03:35,401] [INFO] [timer.py:197:stop] 0/4088, RunningAvgSamplesPerSec=12.005135814137892, CurrSamplesPerSec=11.904795854839925, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:03:41,904] [INFO] [timer.py:197:stop] 0/4089, RunningAvgSamplesPerSec=12.005110873391025, CurrSamplesPerSec=11.904060974799629, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:03:48,391] [INFO] [logging.py:68:log_dist] [Rank 0] step=4090, skipped=8, lr=[2.0422222222222225e-06], mom=[[0.9, 0.999]] [2022-12-20 02:03:48,392] [INFO] [timer.py:197:stop] 0/4090, RunningAvgSamplesPerSec=12.005087359031485, CurrSamplesPerSec=11.909747572877562, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:03:54,898] [INFO] [timer.py:197:stop] 0/4091, RunningAvgSamplesPerSec=12.005065513430397, CurrSamplesPerSec=11.916420282974936, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:04:01,334] [INFO] [timer.py:197:stop] 0/4092, RunningAvgSamplesPerSec=12.005047739550236, CurrSamplesPerSec=11.932807784212917, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:04:07,818] [INFO] [timer.py:197:stop] 0/4093, 
RunningAvgSamplesPerSec=12.005022085351971, CurrSamplesPerSec=11.9010057548659, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:04:14,257] [INFO] [timer.py:197:stop] 0/4094, RunningAvgSamplesPerSec=12.005016135916696, CurrSamplesPerSec=11.980726253891437, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:04:20,788] [INFO] [timer.py:197:stop] 0/4095, RunningAvgSamplesPerSec=12.004973574878123, CurrSamplesPerSec=11.833304863388125, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:04:27,216] [INFO] [timer.py:197:stop] 0/4096, RunningAvgSamplesPerSec=12.004972594445022, CurrSamplesPerSec=12.000961023040679, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:04:33,697] [INFO] [timer.py:197:stop] 0/4097, RunningAvgSamplesPerSec=12.00494810114184, CurrSamplesPerSec=11.905503368182638, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:04:40,173] [INFO] [timer.py:197:stop] 0/4098, RunningAvgSamplesPerSec=12.004939086080991, CurrSamplesPerSec=11.968135614772168, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:04:46,637] [INFO] [timer.py:197:stop] 0/4099, RunningAvgSamplesPerSec=12.004922129654753, CurrSamplesPerSec=11.935868211271112, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:04:53,076] [INFO] [logging.py:68:log_dist] [Rank 0] step=4100, skipped=8, lr=[2.02e-06], mom=[[0.9, 0.999]] [2022-12-20 02:04:53,077] [INFO] [timer.py:197:stop] 0/4100, RunningAvgSamplesPerSec=12.004907089587013, CurrSamplesPerSec=11.943602674052569, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.02e-06, 'epoch': 107.89} [2022-12-20 02:04:59,888] [INFO] [timer.py:197:stop] 0/4101, RunningAvgSamplesPerSec=12.004886612530674, CurrSamplesPerSec=11.92155427652141, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:05:06,446] [INFO] [timer.py:197:stop] 0/4102, RunningAvgSamplesPerSec=12.00487247348239, CurrSamplesPerSec=11.947195032047544, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 02:05:12,900] [INFO] [timer.py:197:stop] 0/4103, RunningAvgSamplesPerSec=12.00487115741532, CurrSamplesPerSec=11.9994777072313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:05:17,600] [INFO] [timer.py:197:stop] 0/4104, RunningAvgSamplesPerSec=12.00565576975593, CurrSamplesPerSec=16.401896527357152, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:05:24,066] [INFO] [timer.py:197:stop] 0/4105, RunningAvgSamplesPerSec=12.00564854107965, CurrSamplesPerSec=11.976069584087316, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:05:30,490] [INFO] [timer.py:197:stop] 0/4106, RunningAvgSamplesPerSec=12.005649793778522, CurrSamplesPerSec=12.010791819171793, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:05:36,882] [INFO] [timer.py:197:stop] 0/4107, RunningAvgSamplesPerSec=12.005643656929513, CurrSamplesPerSec=11.98051076561553, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:05:43,365] [INFO] [timer.py:197:stop] 0/4108, RunningAvgSamplesPerSec=12.005614688531502, CurrSamplesPerSec=11.88786599627486, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:05:49,858] [INFO] [timer.py:197:stop] 0/4109, RunningAvgSamplesPerSec=12.005583483985756, CurrSamplesPerSec=11.878810889837741, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:05:56,358] [INFO] [logging.py:68:log_dist] [Rank 0] step=4110, skipped=8, lr=[1.9977777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 02:05:56,358] [INFO] [timer.py:197:stop] 0/4110, RunningAvgSamplesPerSec=12.00553984601997, CurrSamplesPerSec=11.828955449947257, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:06:03,004] [INFO] [timer.py:197:stop] 0/4111, RunningAvgSamplesPerSec=12.005535575361783, CurrSamplesPerSec=11.988017317449492, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:06:09,487] [INFO] [timer.py:197:stop] 0/4112, RunningAvgSamplesPerSec=12.005500372923317, CurrSamplesPerSec=11.862575979618155, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 02:06:15,914] [INFO] [timer.py:197:stop] 0/4113, RunningAvgSamplesPerSec=12.005484831330945, CurrSamplesPerSec=11.94194702586478, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:06:22,413] [INFO] [timer.py:197:stop] 0/4114, RunningAvgSamplesPerSec=12.00547590647529, CurrSamplesPerSec=11.968897639357598, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:06:28,888] [INFO] [timer.py:197:stop] 0/4115, RunningAvgSamplesPerSec=12.0054556588735, CurrSamplesPerSec=11.922771079408802, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:06:35,645] [INFO] [timer.py:197:stop] 0/4116, RunningAvgSamplesPerSec=12.00542999748324, CurrSamplesPerSec=11.900804732170613, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:06:42,113] [INFO] [timer.py:197:stop] 0/4117, RunningAvgSamplesPerSec=12.005399923317514, CurrSamplesPerSec=11.88293718740582, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:06:48,547] [INFO] [timer.py:197:stop] 0/4118, RunningAvgSamplesPerSec=12.005398157006958, CurrSamplesPerSec=11.998134187931042, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:06:55,081] [INFO] [timer.py:197:stop] 0/4119, RunningAvgSamplesPerSec=12.005393710324475, CurrSamplesPerSec=11.98711903222247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:07:01,496] [INFO] [logging.py:68:log_dist] [Rank 0] step=4120, skipped=8, lr=[1.975555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 02:07:01,497] [INFO] [timer.py:197:stop] 0/4120, RunningAvgSamplesPerSec=12.005374409281368, CurrSamplesPerSec=11.926434636331555, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:07:08,022] [INFO] [timer.py:197:stop] 0/4121, RunningAvgSamplesPerSec=12.00534861231883, CurrSamplesPerSec=11.900048715485385, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:07:14,457] [INFO] [timer.py:197:stop] 0/4122, RunningAvgSamplesPerSec=12.005356716330548, CurrSamplesPerSec=12.038830234999756, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:07:20,868] [INFO] [timer.py:197:stop] 0/4123, RunningAvgSamplesPerSec=12.005348712703578, CurrSamplesPerSec=11.972464115227329, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:07:27,508] [INFO] [timer.py:197:stop] 0/4124, RunningAvgSamplesPerSec=12.005319697080495, CurrSamplesPerSec=11.886925807570954, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:07:34,023] [INFO] [timer.py:197:stop] 0/4125, RunningAvgSamplesPerSec=12.005305391376513, CurrSamplesPerSec=11.946625575228143, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.9644444444444446e-06, 'epoch': 108.55} [2022-12-20 02:07:40,612] [INFO] [timer.py:197:stop] 0/4126, RunningAvgSamplesPerSec=12.005261351873127, CurrSamplesPerSec=11.826392457485241, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:07:47,097] [INFO] [timer.py:197:stop] 0/4127, RunningAvgSamplesPerSec=12.005239718675373, CurrSamplesPerSec=11.91668267038006, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:07:53,622] [INFO] [timer.py:197:stop] 0/4128, RunningAvgSamplesPerSec=12.005201240055372, CurrSamplesPerSec=11.84854858560964, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:08:00,156] [INFO] [timer.py:197:stop] 0/4129, RunningAvgSamplesPerSec=12.005174313682852, CurrSamplesPerSec=11.895095042505389, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:08:06,525] [INFO] [logging.py:68:log_dist] [Rank 0] step=4130, skipped=8, lr=[1.9533333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 02:08:06,525] [INFO] [timer.py:197:stop] 0/4130, RunningAvgSamplesPerSec=12.005176487947674, CurrSamplesPerSec=12.014156392466042, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:08:13,192] [INFO] [timer.py:197:stop] 0/4131, RunningAvgSamplesPerSec=12.005147666261847, CurrSamplesPerSec=11.887339556504921, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:08:19,642] [INFO] [timer.py:197:stop] 
0/4132, RunningAvgSamplesPerSec=12.005134594529334, CurrSamplesPerSec=11.951403038060324, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:08:26,141] [INFO] [timer.py:197:stop] 0/4133, RunningAvgSamplesPerSec=12.005113248217725, CurrSamplesPerSec=11.917595826302406, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:08:32,587] [INFO] [timer.py:197:stop] 0/4134, RunningAvgSamplesPerSec=12.005114515700834, CurrSamplesPerSec=12.010352773620122, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:08:39,070] [INFO] [timer.py:197:stop] 0/4135, RunningAvgSamplesPerSec=12.005089101086885, CurrSamplesPerSec=11.900986760304953, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:08:45,656] [INFO] [timer.py:197:stop] 0/4136, RunningAvgSamplesPerSec=12.00507220724615, CurrSamplesPerSec=11.935653802911114, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:08:52,166] [INFO] [timer.py:197:stop] 0/4137, RunningAvgSamplesPerSec=12.005070634480273, CurrSamplesPerSec=11.99857234060556, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:08:58,804] [INFO] [timer.py:197:stop] 0/4138, RunningAvgSamplesPerSec=12.005047725476063, CurrSamplesPerSec=11.911060798998822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:09:05,281] [INFO] [timer.py:197:stop] 0/4139, RunningAvgSamplesPerSec=12.005026298407467, CurrSamplesPerSec=11.917053523898726, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:09:11,688] [INFO] [logging.py:68:log_dist] [Rank 0] step=4140, skipped=8, lr=[1.9311111111111114e-06], mom=[[0.9, 0.999]] [2022-12-20 02:09:11,689] [INFO] [timer.py:197:stop] 0/4140, RunningAvgSamplesPerSec=12.005030248225935, CurrSamplesPerSec=12.021392924338416, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:09:18,215] [INFO] [timer.py:197:stop] 0/4141, RunningAvgSamplesPerSec=12.005010029302335, CurrSamplesPerSec=11.921923315384712, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:09:22,815] [INFO] 
[timer.py:197:stop] 0/4142, RunningAvgSamplesPerSec=12.005820278728365, CurrSamplesPerSec=16.65975431969967, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:09:29,295] [INFO] [timer.py:197:stop] 0/4143, RunningAvgSamplesPerSec=12.005798705202128, CurrSamplesPerSec=11.917143992381849, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:09:35,767] [INFO] [timer.py:197:stop] 0/4144, RunningAvgSamplesPerSec=12.005803422986082, CurrSamplesPerSec=12.025371616211052, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:09:42,234] [INFO] [timer.py:197:stop] 0/4145, RunningAvgSamplesPerSec=12.005804045124238, CurrSamplesPerSec=12.008381494718979, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:09:48,694] [INFO] [timer.py:197:stop] 0/4146, RunningAvgSamplesPerSec=12.00578364052215, CurrSamplesPerSec=11.921838598560687, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:09:55,128] [INFO] [timer.py:197:stop] 0/4147, RunningAvgSamplesPerSec=12.005763876840483, CurrSamplesPerSec=11.92441823610419, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:10:01,601] [INFO] [timer.py:197:stop] 0/4148, RunningAvgSamplesPerSec=12.005754035979608, CurrSamplesPerSec=11.96510181964091, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:10:08,027] [INFO] [timer.py:197:stop] 0/4149, RunningAvgSamplesPerSec=12.00575432021219, CurrSamplesPerSec=12.006932864212324, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:10:14,582] [INFO] [logging.py:68:log_dist] [Rank 0] step=4150, skipped=8, lr=[1.908888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 02:10:14,583] [INFO] [timer.py:197:stop] 0/4150, RunningAvgSamplesPerSec=12.005733751068913, CurrSamplesPerSec=11.921035436992348, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.908888888888889e-06, 'epoch': 109.21} [2022-12-20 02:10:21,081] [INFO] [timer.py:197:stop] 0/4151, RunningAvgSamplesPerSec=12.005717376623844, 
CurrSamplesPerSec=11.938178366092723, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:10:27,532] [INFO] [timer.py:197:stop] 0/4152, RunningAvgSamplesPerSec=12.005708410606648, CurrSamplesPerSec=11.968623342019546, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:10:33,973] [INFO] [timer.py:197:stop] 0/4153, RunningAvgSamplesPerSec=12.005688630996236, CurrSamplesPerSec=11.924160804765659, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:10:40,493] [INFO] [timer.py:197:stop] 0/4154, RunningAvgSamplesPerSec=12.005663251341081, CurrSamplesPerSec=11.901228945498861, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:10:47,027] [INFO] [timer.py:197:stop] 0/4155, RunningAvgSamplesPerSec=12.00562971463284, CurrSamplesPerSec=11.867982158642292, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:10:53,572] [INFO] [timer.py:197:stop] 0/4156, RunningAvgSamplesPerSec=12.005599804853771, CurrSamplesPerSec=11.882656821831276, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:11:00,025] [INFO] [timer.py:197:stop] 0/4157, RunningAvgSamplesPerSec=12.005605204574294, CurrSamplesPerSec=12.028077639678774, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:11:06,513] [INFO] [timer.py:197:stop] 0/4158, RunningAvgSamplesPerSec=12.0055657261272, CurrSamplesPerSec=11.843744289038138, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:11:12,937] [INFO] [timer.py:197:stop] 0/4159, RunningAvgSamplesPerSec=12.005563577062011, CurrSamplesPerSec=11.996638703382914, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:11:19,433] [INFO] [logging.py:68:log_dist] [Rank 0] step=4160, skipped=8, lr=[1.8866666666666669e-06], mom=[[0.9, 0.999]] [2022-12-20 02:11:19,434] [INFO] [timer.py:197:stop] 0/4160, RunningAvgSamplesPerSec=12.005532250888773, CurrSamplesPerSec=11.87670704007222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:11:25,934] [INFO] [timer.py:197:stop] 0/4161, 
RunningAvgSamplesPerSec=12.005520061605454, CurrSamplesPerSec=11.955050138887843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:11:32,369] [INFO] [timer.py:197:stop] 0/4162, RunningAvgSamplesPerSec=12.005516229736392, CurrSamplesPerSec=11.989600618588822, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:11:38,864] [INFO] [timer.py:197:stop] 0/4163, RunningAvgSamplesPerSec=12.00549486219098, CurrSamplesPerSec=11.917259328463233, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:11:45,436] [INFO] [timer.py:197:stop] 0/4164, RunningAvgSamplesPerSec=12.005483637157138, CurrSamplesPerSec=11.958957325571195, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:11:51,910] [INFO] [timer.py:197:stop] 0/4165, RunningAvgSamplesPerSec=12.0054628580285, CurrSamplesPerSec=11.919598804840618, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:11:58,432] [INFO] [timer.py:197:stop] 0/4166, RunningAvgSamplesPerSec=12.005435644395481, CurrSamplesPerSec=11.893204622496011, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:12:04,880] [INFO] [timer.py:197:stop] 0/4167, RunningAvgSamplesPerSec=12.005420328517161, CurrSamplesPerSec=11.941982089392882, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:12:11,420] [INFO] [timer.py:197:stop] 0/4168, RunningAvgSamplesPerSec=12.005397609669009, CurrSamplesPerSec=11.911513757423355, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:12:17,899] [INFO] [timer.py:197:stop] 0/4169, RunningAvgSamplesPerSec=12.005400096245003, CurrSamplesPerSec=12.015768120218443, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:12:24,439] [INFO] [logging.py:68:log_dist] [Rank 0] step=4170, skipped=8, lr=[1.8644444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 02:12:24,440] [INFO] [timer.py:197:stop] 0/4170, RunningAvgSamplesPerSec=12.005393973702818, CurrSamplesPerSec=11.979935455242353, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:12:30,942] [INFO] 
[timer.py:197:stop] 0/4171, RunningAvgSamplesPerSec=12.005379670564212, CurrSamplesPerSec=11.946058830840917, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:12:37,472] [INFO] [timer.py:197:stop] 0/4172, RunningAvgSamplesPerSec=12.005349377424672, CurrSamplesPerSec=11.880372309306154, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:12:44,032] [INFO] [timer.py:197:stop] 0/4173, RunningAvgSamplesPerSec=12.005330678808065, CurrSamplesPerSec=11.927860726582713, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:12:50,474] [INFO] [timer.py:197:stop] 0/4174, RunningAvgSamplesPerSec=12.005314865554412, CurrSamplesPerSec=11.939718258088202, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:12:56,935] [INFO] [timer.py:197:stop] 0/4175, RunningAvgSamplesPerSec=12.005314495655735, CurrSamplesPerSec=12.003771476769264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.8533333333333333e-06, 'epoch': 109.87} [2022-12-20 02:13:03,448] [INFO] [timer.py:197:stop] 0/4176, RunningAvgSamplesPerSec=12.005291787834931, CurrSamplesPerSec=11.911274324699097, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:13:09,988] [INFO] [timer.py:197:stop] 0/4177, RunningAvgSamplesPerSec=12.005237246012953, CurrSamplesPerSec=11.781817447646636, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:13:16,496] [INFO] [timer.py:197:stop] 0/4178, RunningAvgSamplesPerSec=12.00520156006416, CurrSamplesPerSec=11.85803949504323, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:13:23,030] [INFO] [timer.py:197:stop] 0/4179, RunningAvgSamplesPerSec=12.005189845570877, CurrSamplesPerSec=11.956468702041528, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:13:27,769] [INFO] [logging.py:68:log_dist] [Rank 0] step=4180, skipped=8, lr=[1.8422222222222225e-06], mom=[[0.9, 0.999]] [2022-12-20 02:13:27,770] [INFO] [timer.py:197:stop] 0/4180, RunningAvgSamplesPerSec=12.005970691448994, 
CurrSamplesPerSec=16.48451590530319, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:13:34,256] [INFO] [timer.py:197:stop] 0/4181, RunningAvgSamplesPerSec=12.005944042617243, CurrSamplesPerSec=11.895628495236462, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:13:40,826] [INFO] [timer.py:197:stop] 0/4182, RunningAvgSamplesPerSec=12.005917739499289, CurrSamplesPerSec=11.896994499918009, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:13:47,342] [INFO] [timer.py:197:stop] 0/4183, RunningAvgSamplesPerSec=12.005898990661356, CurrSamplesPerSec=11.928037223002484, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:13:53,779] [INFO] [timer.py:197:stop] 0/4184, RunningAvgSamplesPerSec=12.005902761674303, CurrSamplesPerSec=12.021690104298571, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:14:00,256] [INFO] [timer.py:197:stop] 0/4185, RunningAvgSamplesPerSec=12.005904257589863, CurrSamplesPerSec=12.012163438712273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:14:06,753] [INFO] [timer.py:197:stop] 0/4186, RunningAvgSamplesPerSec=12.005883310594642, CurrSamplesPerSec=11.918897024252507, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:14:13,126] [INFO] [timer.py:197:stop] 0/4187, RunningAvgSamplesPerSec=12.005884799532101, CurrSamplesPerSec=12.012117748848786, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:14:19,608] [INFO] [timer.py:197:stop] 0/4188, RunningAvgSamplesPerSec=12.005887673793456, CurrSamplesPerSec=12.017928524259892, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:14:26,107] [INFO] [timer.py:197:stop] 0/4189, RunningAvgSamplesPerSec=12.005859172256192, CurrSamplesPerSec=11.887725958744234, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:14:32,644] [INFO] [logging.py:68:log_dist] [Rank 0] step=4190, skipped=8, lr=[1.8200000000000002e-06], mom=[[0.9, 0.999]] [2022-12-20 02:14:32,645] [INFO] [timer.py:197:stop] 0/4190, 
RunningAvgSamplesPerSec=12.005836609783046, CurrSamplesPerSec=11.912105245080493, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:14:39,091] [INFO] [timer.py:197:stop] 0/4191, RunningAvgSamplesPerSec=12.00581682168108, CurrSamplesPerSec=11.92351250841787, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:14:45,620] [INFO] [timer.py:197:stop] 0/4192, RunningAvgSamplesPerSec=12.0057888748922, CurrSamplesPerSec=11.88985056688803, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:14:52,101] [INFO] [timer.py:197:stop] 0/4193, RunningAvgSamplesPerSec=12.005789980110434, CurrSamplesPerSec=12.010422631842056, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:14:58,597] [INFO] [timer.py:197:stop] 0/4194, RunningAvgSamplesPerSec=12.005771584691177, CurrSamplesPerSec=11.929168408144093, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:15:04,987] [INFO] [timer.py:197:stop] 0/4195, RunningAvgSamplesPerSec=12.005765955982756, CurrSamplesPerSec=11.98221670386957, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:15:11,449] [INFO] [timer.py:197:stop] 0/4196, RunningAvgSamplesPerSec=12.005715287844234, CurrSamplesPerSec=11.796958790632166, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:15:17,850] [INFO] [timer.py:197:stop] 0/4197, RunningAvgSamplesPerSec=12.005710365714297, CurrSamplesPerSec=11.985102395913602, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:15:24,411] [INFO] [timer.py:197:stop] 0/4198, RunningAvgSamplesPerSec=12.005685293503413, CurrSamplesPerSec=11.901421012224922, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:15:30,889] [INFO] [timer.py:197:stop] 0/4199, RunningAvgSamplesPerSec=12.005677042770113, CurrSamplesPerSec=11.971156534351277, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:15:37,368] [INFO] [logging.py:68:log_dist] [Rank 0] step=4200, skipped=8, lr=[1.797777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 02:15:37,369] [INFO] 
[timer.py:197:stop] 0/4200, RunningAvgSamplesPerSec=12.00564691567691, CurrSamplesPerSec=11.880521638259786, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.797777777777778e-06, 'epoch': 110.53} [2022-12-20 02:15:43,848] [INFO] [timer.py:197:stop] 0/4201, RunningAvgSamplesPerSec=12.005644735751423, CurrSamplesPerSec=11.996500380516807, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:15:50,267] [INFO] [timer.py:197:stop] 0/4202, RunningAvgSamplesPerSec=12.005650893730586, CurrSamplesPerSec=12.03156407243515, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:15:56,741] [INFO] [timer.py:197:stop] 0/4203, RunningAvgSamplesPerSec=12.005632708653149, CurrSamplesPerSec=11.929738322599913, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:16:03,184] [INFO] [timer.py:197:stop] 0/4204, RunningAvgSamplesPerSec=12.005615136316175, CurrSamplesPerSec=11.93224500206542, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:16:09,651] [INFO] [timer.py:197:stop] 0/4205, RunningAvgSamplesPerSec=12.005594940825983, CurrSamplesPerSec=11.921329264036865, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:16:16,124] [INFO] [timer.py:197:stop] 0/4206, RunningAvgSamplesPerSec=12.005571477172916, CurrSamplesPerSec=11.907757411577727, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:16:22,652] [INFO] [timer.py:197:stop] 0/4207, RunningAvgSamplesPerSec=12.00556191532752, CurrSamplesPerSec=11.96549809413098, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:16:29,161] [INFO] [timer.py:197:stop] 0/4208, RunningAvgSamplesPerSec=12.005540324050154, CurrSamplesPerSec=11.91543061512299, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:16:35,670] [INFO] [timer.py:197:stop] 0/4209, RunningAvgSamplesPerSec=12.00552395638011, CurrSamplesPerSec=11.937074136426293, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:16:42,224] [INFO] [logging.py:68:log_dist] [Rank 0] step=4210, skipped=8, 
lr=[1.7755555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 02:16:42,224] [INFO] [timer.py:197:stop] 0/4210, RunningAvgSamplesPerSec=12.00550195515337, CurrSamplesPerSec=11.913651107676747, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:16:48,595] [INFO] [timer.py:197:stop] 0/4211, RunningAvgSamplesPerSec=12.005500550201676, CurrSamplesPerSec=11.99959142407173, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:16:55,033] [INFO] [timer.py:197:stop] 0/4212, RunningAvgSamplesPerSec=12.005505266198421, CurrSamplesPerSec=12.025387777610781, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:17:01,411] [INFO] [timer.py:197:stop] 0/4213, RunningAvgSamplesPerSec=12.005509951779842, CurrSamplesPerSec=12.025268722984798, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:17:07,897] [INFO] [timer.py:197:stop] 0/4214, RunningAvgSamplesPerSec=12.005502462841873, CurrSamplesPerSec=11.97404918578745, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:17:14,846] [INFO] [timer.py:197:stop] 0/4215, RunningAvgSamplesPerSec=12.005467415002984, CurrSamplesPerSec=11.859639471283693, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:17:21,916] [INFO] [timer.py:197:stop] 0/4216, RunningAvgSamplesPerSec=12.00543325055334, CurrSamplesPerSec=11.86320403500721, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:17:28,878] [INFO] [timer.py:197:stop] 0/4217, RunningAvgSamplesPerSec=12.00541793782895, CurrSamplesPerSec=11.941235174516962, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:17:33,878] [INFO] [timer.py:197:stop] 0/4218, RunningAvgSamplesPerSec=12.006177695011475, CurrSamplesPerSec=16.373793362328524, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:17:40,786] [INFO] [timer.py:197:stop] 0/4219, RunningAvgSamplesPerSec=12.006175003571903, CurrSamplesPerSec=11.994838610968456, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:17:47,275] [INFO] [logging.py:68:log_dist] [Rank 0] step=4220, 
skipped=8, lr=[1.7533333333333336e-06], mom=[[0.9, 0.999]] [2022-12-20 02:17:47,275] [INFO] [timer.py:197:stop] 0/4220, RunningAvgSamplesPerSec=12.006158370429173, CurrSamplesPerSec=11.936423904431933, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:17:53,907] [INFO] [timer.py:197:stop] 0/4221, RunningAvgSamplesPerSec=12.006125258125355, CurrSamplesPerSec=11.868064012939609, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:18:00,652] [INFO] [timer.py:197:stop] 0/4222, RunningAvgSamplesPerSec=12.006072500129509, CurrSamplesPerSec=11.787538967512845, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:18:07,105] [INFO] [timer.py:197:stop] 0/4223, RunningAvgSamplesPerSec=12.006067508767341, CurrSamplesPerSec=11.985040858501348, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:18:13,619] [INFO] [timer.py:197:stop] 0/4224, RunningAvgSamplesPerSec=12.006064442588787, CurrSamplesPerSec=11.993136042816452, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:18:20,100] [INFO] [timer.py:197:stop] 0/4225, RunningAvgSamplesPerSec=12.006057330840525, CurrSamplesPerSec=11.976106451203746, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.7422222222222224e-06, 'epoch': 111.18} [2022-12-20 02:18:26,600] [INFO] [timer.py:197:stop] 0/4226, RunningAvgSamplesPerSec=12.00601898296052, CurrSamplesPerSec=11.846231685834594, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:18:33,187] [INFO] [timer.py:197:stop] 0/4227, RunningAvgSamplesPerSec=12.005985764735453, CurrSamplesPerSec=11.867293263048508, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:18:39,836] [INFO] [timer.py:197:stop] 0/4228, RunningAvgSamplesPerSec=12.005962766456971, CurrSamplesPerSec=11.909575316006835, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:18:46,427] [INFO] [timer.py:197:stop] 0/4229, RunningAvgSamplesPerSec=12.005944905991123, CurrSamplesPerSec=11.930938236793132, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 02:18:52,978] [INFO] [logging.py:68:log_dist] [Rank 0] step=4230, skipped=8, lr=[1.7311111111111112e-06], mom=[[0.9, 0.999]] [2022-12-20 02:18:52,978] [INFO] [timer.py:197:stop] 0/4230, RunningAvgSamplesPerSec=12.005938339082828, CurrSamplesPerSec=11.978244063397595, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:18:59,498] [INFO] [timer.py:197:stop] 0/4231, RunningAvgSamplesPerSec=12.005919130232103, CurrSamplesPerSec=11.925249931818703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:19:05,954] [INFO] [timer.py:197:stop] 0/4232, RunningAvgSamplesPerSec=12.005922664685736, CurrSamplesPerSec=12.020888505759995, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:19:12,450] [INFO] [timer.py:197:stop] 0/4233, RunningAvgSamplesPerSec=12.005903543548607, CurrSamplesPerSec=11.925562509882049, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:19:18,994] [INFO] [timer.py:197:stop] 0/4234, RunningAvgSamplesPerSec=12.005859404434728, CurrSamplesPerSec=11.821967943533338, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:19:25,502] [INFO] [timer.py:197:stop] 0/4235, RunningAvgSamplesPerSec=12.00584240446686, CurrSamplesPerSec=11.934327189804273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:19:32,053] [INFO] [timer.py:197:stop] 0/4236, RunningAvgSamplesPerSec=12.00584271221334, CurrSamplesPerSec=12.007145544463533, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:19:38,654] [INFO] [timer.py:197:stop] 0/4237, RunningAvgSamplesPerSec=12.005839059339298, CurrSamplesPerSec=11.990392693794696, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:19:45,179] [INFO] [timer.py:197:stop] 0/4238, RunningAvgSamplesPerSec=12.005825305272568, CurrSamplesPerSec=11.947858137479695, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:19:51,596] [INFO] [timer.py:197:stop] 0/4239, RunningAvgSamplesPerSec=12.005831694517587, CurrSamplesPerSec=12.03295770122329, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:19:58,126] [INFO] [logging.py:68:log_dist] [Rank 0] step=4240, skipped=8, lr=[1.708888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 02:19:58,127] [INFO] [timer.py:197:stop] 0/4240, RunningAvgSamplesPerSec=12.005806703671539, CurrSamplesPerSec=11.90084641348374, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:20:04,602] [INFO] [timer.py:197:stop] 0/4241, RunningAvgSamplesPerSec=12.005781878119093, CurrSamplesPerSec=11.901485387624604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:20:11,146] [INFO] [timer.py:197:stop] 0/4242, RunningAvgSamplesPerSec=12.005776387884602, CurrSamplesPerSec=11.982548321926046, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:20:17,709] [INFO] [timer.py:197:stop] 0/4243, RunningAvgSamplesPerSec=12.005745649476115, CurrSamplesPerSec=11.876814763461603, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:20:24,196] [INFO] [timer.py:197:stop] 0/4244, RunningAvgSamplesPerSec=12.005750352560176, CurrSamplesPerSec=12.025729332018203, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:20:30,663] [INFO] [timer.py:197:stop] 0/4245, RunningAvgSamplesPerSec=12.005730073807479, CurrSamplesPerSec=11.920319724337718, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:20:37,070] [INFO] [timer.py:197:stop] 0/4246, RunningAvgSamplesPerSec=12.00573179332644, CurrSamplesPerSec=12.013032149770373, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:20:43,551] [INFO] [timer.py:197:stop] 0/4247, RunningAvgSamplesPerSec=12.005709699255872, CurrSamplesPerSec=11.912669300983318, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:20:50,201] [INFO] [timer.py:197:stop] 0/4248, RunningAvgSamplesPerSec=12.00568047357943, CurrSamplesPerSec=11.88288668906469, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:20:56,770] [INFO] [timer.py:197:stop] 0/4249, RunningAvgSamplesPerSec=12.005663258866347, 
CurrSamplesPerSec=11.93301201140848, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:21:03,264] [INFO] [logging.py:68:log_dist] [Rank 0] step=4250, skipped=8, lr=[1.6866666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 02:21:03,265] [INFO] [timer.py:197:stop] 0/4250, RunningAvgSamplesPerSec=12.005646049021054, CurrSamplesPerSec=11.93299821922192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.6866666666666667e-06, 'epoch': 111.84} [2022-12-20 02:21:09,730] [INFO] [timer.py:197:stop] 0/4251, RunningAvgSamplesPerSec=12.005632501182545, CurrSamplesPerSec=11.948355913737322, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:21:16,191] [INFO] [timer.py:197:stop] 0/4252, RunningAvgSamplesPerSec=12.005614477719764, CurrSamplesPerSec=11.9295183026591, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:21:22,707] [INFO] [timer.py:197:stop] 0/4253, RunningAvgSamplesPerSec=12.005556195286005, CurrSamplesPerSec=11.762864284028465, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:21:29,186] [INFO] [timer.py:197:stop] 0/4254, RunningAvgSamplesPerSec=12.005531744879333, CurrSamplesPerSec=11.90248540641641, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:21:35,733] [INFO] [timer.py:197:stop] 0/4255, RunningAvgSamplesPerSec=12.005509456080224, CurrSamplesPerSec=11.911479929699603, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:21:40,480] [INFO] [timer.py:197:stop] 0/4256, RunningAvgSamplesPerSec=12.00629059496938, CurrSamplesPerSec=16.599818922617402, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:21:46,910] [INFO] [timer.py:197:stop] 0/4257, RunningAvgSamplesPerSec=12.006269816527695, CurrSamplesPerSec=11.918524468442525, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:21:53,398] [INFO] [timer.py:197:stop] 0/4258, RunningAvgSamplesPerSec=12.006274288520219, CurrSamplesPerSec=12.025332829028939, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
02:22:00,037] [INFO] [timer.py:197:stop] 0/4259, RunningAvgSamplesPerSec=12.006260725247685, CurrSamplesPerSec=11.94881171320301, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:22:06,552] [INFO] [logging.py:68:log_dist] [Rank 0] step=4260, skipped=8, lr=[1.6644444444444447e-06], mom=[[0.9, 0.999]] [2022-12-20 02:22:06,553] [INFO] [timer.py:197:stop] 0/4260, RunningAvgSamplesPerSec=12.00627451548505, CurrSamplesPerSec=12.065268074487886, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:22:13,158] [INFO] [timer.py:197:stop] 0/4261, RunningAvgSamplesPerSec=12.006256601742123, CurrSamplesPerSec=11.930461529654703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:22:19,727] [INFO] [timer.py:197:stop] 0/4262, RunningAvgSamplesPerSec=12.006238646544537, CurrSamplesPerSec=11.930251557283288, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:22:26,192] [INFO] [timer.py:197:stop] 0/4263, RunningAvgSamplesPerSec=12.006235089403981, CurrSamplesPerSec=11.99110077658031, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:22:32,672] [INFO] [timer.py:197:stop] 0/4264, RunningAvgSamplesPerSec=12.006233437245314, CurrSamplesPerSec=11.99919771551863, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:22:39,088] [INFO] [timer.py:197:stop] 0/4265, RunningAvgSamplesPerSec=12.00622951731537, CurrSamplesPerSec=11.989545996569943, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:22:45,784] [INFO] [timer.py:197:stop] 0/4266, RunningAvgSamplesPerSec=12.006197444145837, CurrSamplesPerSec=11.871009419931905, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:22:52,477] [INFO] [timer.py:197:stop] 0/4267, RunningAvgSamplesPerSec=12.006175575090978, CurrSamplesPerSec=11.913644762687488, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:22:59,052] [INFO] [timer.py:197:stop] 0/4268, RunningAvgSamplesPerSec=12.006151569659092, CurrSamplesPerSec=11.904634300154065, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 02:23:05,587] [INFO] [timer.py:197:stop] 0/4269, RunningAvgSamplesPerSec=12.006118739223636, CurrSamplesPerSec=11.86767941249483, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:23:12,080] [INFO] [logging.py:68:log_dist] [Rank 0] step=4270, skipped=8, lr=[1.6422222222222223e-06], mom=[[0.9, 0.999]] [2022-12-20 02:23:12,081] [INFO] [timer.py:197:stop] 0/4270, RunningAvgSamplesPerSec=12.00610000490313, CurrSamplesPerSec=11.926689516322877, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:23:18,529] [INFO] [timer.py:197:stop] 0/4271, RunningAvgSamplesPerSec=12.006099125758146, CurrSamplesPerSec=12.002348107521291, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:23:25,084] [INFO] [timer.py:197:stop] 0/4272, RunningAvgSamplesPerSec=12.00606348494076, CurrSamplesPerSec=11.855817323689873, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:23:31,581] [INFO] [timer.py:197:stop] 0/4273, RunningAvgSamplesPerSec=12.006042734570693, CurrSamplesPerSec=11.918087909240564, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:23:38,132] [INFO] [timer.py:197:stop] 0/4274, RunningAvgSamplesPerSec=12.006020716144759, CurrSamplesPerSec=11.91271106544682, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:23:44,749] [INFO] [timer.py:197:stop] 0/4275, RunningAvgSamplesPerSec=12.005991647341743, CurrSamplesPerSec=11.883081320448532, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.6311111111111114e-06, 'epoch': 112.5} [2022-12-20 02:23:51,417] [INFO] [timer.py:197:stop] 0/4276, RunningAvgSamplesPerSec=12.005958233287629, CurrSamplesPerSec=11.864858372237105, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:23:57,874] [INFO] [timer.py:197:stop] 0/4277, RunningAvgSamplesPerSec=12.005962261778707, CurrSamplesPerSec=12.023204765935002, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:24:04,396] [INFO] [timer.py:197:stop] 0/4278, RunningAvgSamplesPerSec=12.00593030511897, 
CurrSamplesPerSec=11.870852980596814, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:24:10,864] [INFO] [timer.py:197:stop] 0/4279, RunningAvgSamplesPerSec=12.005908835801216, CurrSamplesPerSec=11.914802835170978, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:24:17,379] [INFO] [logging.py:68:log_dist] [Rank 0] step=4280, skipped=8, lr=[1.6200000000000002e-06], mom=[[0.9, 0.999]] [2022-12-20 02:24:17,380] [INFO] [timer.py:197:stop] 0/4280, RunningAvgSamplesPerSec=12.005892093836819, CurrSamplesPerSec=11.934711346624047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:24:23,864] [INFO] [timer.py:197:stop] 0/4281, RunningAvgSamplesPerSec=12.00589811658156, CurrSamplesPerSec=12.031718844216499, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:24:30,489] [INFO] [timer.py:197:stop] 0/4282, RunningAvgSamplesPerSec=12.005876342498343, CurrSamplesPerSec=11.91342269232111, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:24:37,095] [INFO] [timer.py:197:stop] 0/4283, RunningAvgSamplesPerSec=12.005836870124053, CurrSamplesPerSec=11.83923994709388, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:24:43,539] [INFO] [timer.py:197:stop] 0/4284, RunningAvgSamplesPerSec=12.005807770432657, CurrSamplesPerSec=11.882511647108043, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:24:50,038] [INFO] [timer.py:197:stop] 0/4285, RunningAvgSamplesPerSec=12.005780145153667, CurrSamplesPerSec=11.888643106582842, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:24:56,525] [INFO] [timer.py:197:stop] 0/4286, RunningAvgSamplesPerSec=12.005760076110835, CurrSamplesPerSec=11.920415535916211, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:25:03,060] [INFO] [timer.py:197:stop] 0/4287, RunningAvgSamplesPerSec=12.00574226695231, CurrSamplesPerSec=11.929929720040555, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:25:09,712] [INFO] [timer.py:197:stop] 0/4288, 
RunningAvgSamplesPerSec=12.005727804516768, CurrSamplesPerSec=11.944074585966, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:25:16,173] [INFO] [timer.py:197:stop] 0/4289, RunningAvgSamplesPerSec=12.005717604781873, CurrSamplesPerSec=11.962160182381913, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:25:22,650] [INFO] [logging.py:68:log_dist] [Rank 0] step=4290, skipped=8, lr=[1.5977777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 02:25:22,651] [INFO] [timer.py:197:stop] 0/4290, RunningAvgSamplesPerSec=12.005671929131472, CurrSamplesPerSec=11.813003554082636, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:25:29,027] [INFO] [timer.py:197:stop] 0/4291, RunningAvgSamplesPerSec=12.005654685453848, CurrSamplesPerSec=11.932166503213725, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:25:35,544] [INFO] [timer.py:197:stop] 0/4292, RunningAvgSamplesPerSec=12.005634185759938, CurrSamplesPerSec=11.91835037009049, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:25:42,064] [INFO] [timer.py:197:stop] 0/4293, RunningAvgSamplesPerSec=12.005607158143578, CurrSamplesPerSec=11.890768042599118, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:25:46,694] [INFO] [timer.py:197:stop] 0/4294, RunningAvgSamplesPerSec=12.006388698589964, CurrSamplesPerSec=16.660159635878365, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:25:53,121] [INFO] [timer.py:197:stop] 0/4295, RunningAvgSamplesPerSec=12.00636215512231, CurrSamplesPerSec=11.893508672363957, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:25:59,605] [INFO] [timer.py:197:stop] 0/4296, RunningAvgSamplesPerSec=12.006354658324156, CurrSamplesPerSec=11.974256963678677, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:26:06,067] [INFO] [timer.py:197:stop] 0/4297, RunningAvgSamplesPerSec=12.006357292820208, CurrSamplesPerSec=12.017680490202878, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:26:12,517] [INFO] 
[timer.py:197:stop] 0/4298, RunningAvgSamplesPerSec=12.006355276001157, CurrSamplesPerSec=11.997699284671329, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:26:19,076] [INFO] [timer.py:197:stop] 0/4299, RunningAvgSamplesPerSec=12.006316823661335, CurrSamplesPerSec=11.843368056054969, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:26:25,504] [INFO] [logging.py:68:log_dist] [Rank 0] step=4300, skipped=8, lr=[1.5755555555555558e-06], mom=[[0.9, 0.999]] [2022-12-20 02:26:25,505] [INFO] [timer.py:197:stop] 0/4300, RunningAvgSamplesPerSec=12.006304426554891, CurrSamplesPerSec=11.953269424146708, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.5755555555555558e-06, 'epoch': 113.16} [2022-12-20 02:26:31,979] [INFO] [timer.py:197:stop] 0/4301, RunningAvgSamplesPerSec=12.006295156713673, CurrSamplesPerSec=11.966585183677754, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:26:38,366] [INFO] [timer.py:197:stop] 0/4302, RunningAvgSamplesPerSec=12.006296391882692, CurrSamplesPerSec=12.01160873352347, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:26:44,768] [INFO] [timer.py:197:stop] 0/4303, RunningAvgSamplesPerSec=12.00629343681623, CurrSamplesPerSec=11.99360008807251, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:26:51,193] [INFO] [timer.py:197:stop] 0/4304, RunningAvgSamplesPerSec=12.006294378717222, CurrSamplesPerSec=12.010346862577096, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:26:57,692] [INFO] [timer.py:197:stop] 0/4305, RunningAvgSamplesPerSec=12.006261558236119, CurrSamplesPerSec=11.866709364673811, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:27:04,240] [INFO] [timer.py:197:stop] 0/4306, RunningAvgSamplesPerSec=12.006220692219438, CurrSamplesPerSec=11.832913123506424, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:27:10,715] [INFO] [timer.py:197:stop] 0/4307, RunningAvgSamplesPerSec=12.00619176625617, 
CurrSamplesPerSec=11.88297243129404, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:27:17,238] [INFO] [timer.py:197:stop] 0/4308, RunningAvgSamplesPerSec=12.00616642725346, CurrSamplesPerSec=11.898064430855868, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:27:23,704] [INFO] [timer.py:197:stop] 0/4309, RunningAvgSamplesPerSec=12.006142591405725, CurrSamplesPerSec=11.904375610606268, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:27:30,164] [INFO] [logging.py:68:log_dist] [Rank 0] step=4310, skipped=8, lr=[1.5533333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 02:27:30,164] [INFO] [timer.py:197:stop] 0/4310, RunningAvgSamplesPerSec=12.006143436228383, CurrSamplesPerSec=12.0097831907524, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:27:36,641] [INFO] [timer.py:197:stop] 0/4311, RunningAvgSamplesPerSec=12.006141128873699, CurrSamplesPerSec=11.996209269556847, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:27:43,165] [INFO] [timer.py:197:stop] 0/4312, RunningAvgSamplesPerSec=12.006103870322702, CurrSamplesPerSec=11.847675785538609, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:27:49,700] [INFO] [timer.py:197:stop] 0/4313, RunningAvgSamplesPerSec=12.006082177813742, CurrSamplesPerSec=11.913310074084876, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:27:56,245] [INFO] [timer.py:197:stop] 0/4314, RunningAvgSamplesPerSec=12.006046414082354, CurrSamplesPerSec=11.853824205835753, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:28:02,777] [INFO] [timer.py:197:stop] 0/4315, RunningAvgSamplesPerSec=12.006014468584434, CurrSamplesPerSec=11.869828352407744, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:28:09,259] [INFO] [timer.py:197:stop] 0/4316, RunningAvgSamplesPerSec=12.00600277306501, CurrSamplesPerSec=11.955771093258509, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:28:15,753] [INFO] [timer.py:197:stop] 0/4317, 
RunningAvgSamplesPerSec=12.005976615474173, CurrSamplesPerSec=11.894183750260096, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:28:22,212] [INFO] [timer.py:197:stop] 0/4318, RunningAvgSamplesPerSec=12.005959514453622, CurrSamplesPerSec=11.932619477019339, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:28:28,662] [INFO] [timer.py:197:stop] 0/4319, RunningAvgSamplesPerSec=12.005944762717588, CurrSamplesPerSec=11.942612205463744, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:28:35,193] [INFO] [logging.py:68:log_dist] [Rank 0] step=4320, skipped=8, lr=[1.5311111111111113e-06], mom=[[0.9, 0.999]] [2022-12-20 02:28:35,193] [INFO] [timer.py:197:stop] 0/4320, RunningAvgSamplesPerSec=12.005922632685184, CurrSamplesPerSec=11.911141663029234, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:28:41,676] [INFO] [timer.py:197:stop] 0/4321, RunningAvgSamplesPerSec=12.005906508916318, CurrSamplesPerSec=11.93668558111663, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:28:48,181] [INFO] [timer.py:197:stop] 0/4322, RunningAvgSamplesPerSec=12.005901052609785, CurrSamplesPerSec=11.982381440867817, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:28:54,616] [INFO] [timer.py:197:stop] 0/4323, RunningAvgSamplesPerSec=12.005897081494746, CurrSamplesPerSec=11.988766348279826, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:29:01,112] [INFO] [timer.py:197:stop] 0/4324, RunningAvgSamplesPerSec=12.005872588665751, CurrSamplesPerSec=11.900964072436619, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:29:07,545] [INFO] [timer.py:197:stop] 0/4325, RunningAvgSamplesPerSec=12.005846388185985, CurrSamplesPerSec=11.89366623638313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.52e-06, 'epoch': 113.82} [2022-12-20 02:29:14,027] [INFO] [timer.py:197:stop] 0/4326, RunningAvgSamplesPerSec=12.0058442659321, CurrSamplesPerSec=11.996676769537542, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 02:29:20,471] [INFO] [timer.py:197:stop] 0/4327, RunningAvgSamplesPerSec=12.005847561856209, CurrSamplesPerSec=12.020116079102873, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:29:26,923] [INFO] [timer.py:197:stop] 0/4328, RunningAvgSamplesPerSec=12.005825172726905, CurrSamplesPerSec=11.909767123818863, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:29:33,463] [INFO] [timer.py:197:stop] 0/4329, RunningAvgSamplesPerSec=12.005802625508082, CurrSamplesPerSec=11.909049593486557, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:29:39,932] [INFO] [logging.py:68:log_dist] [Rank 0] step=4330, skipped=8, lr=[1.5088888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 02:29:39,933] [INFO] [timer.py:197:stop] 0/4330, RunningAvgSamplesPerSec=12.005795491637176, CurrSamplesPerSec=11.975006412741436, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:29:46,349] [INFO] [timer.py:197:stop] 0/4331, RunningAvgSamplesPerSec=12.005794593474944, CurrSamplesPerSec=12.001908605823537, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:29:50,982] [INFO] [timer.py:197:stop] 0/4332, RunningAvgSamplesPerSec=12.006567376732368, CurrSamplesPerSec=16.64451035918809, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:29:57,384] [INFO] [timer.py:197:stop] 0/4333, RunningAvgSamplesPerSec=12.006550774055427, CurrSamplesPerSec=11.93508916032443, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:30:03,841] [INFO] [timer.py:197:stop] 0/4334, RunningAvgSamplesPerSec=12.006527610068078, CurrSamplesPerSec=11.90703589807646, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:30:10,318] [INFO] [timer.py:197:stop] 0/4335, RunningAvgSamplesPerSec=12.006514584590926, CurrSamplesPerSec=11.950352221715107, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:30:16,863] [INFO] [timer.py:197:stop] 0/4336, RunningAvgSamplesPerSec=12.00648481832809, CurrSamplesPerSec=11.878878700540895, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:30:23,329] [INFO] [timer.py:197:stop] 0/4337, RunningAvgSamplesPerSec=12.006459212205256, CurrSamplesPerSec=11.896498884961337, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:30:29,737] [INFO] [timer.py:197:stop] 0/4338, RunningAvgSamplesPerSec=12.006441244493315, CurrSamplesPerSec=11.92905337155222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:30:36,181] [INFO] [timer.py:197:stop] 0/4339, RunningAvgSamplesPerSec=12.006426331887102, CurrSamplesPerSec=11.94211171998148, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:30:42,648] [INFO] [logging.py:68:log_dist] [Rank 0] step=4340, skipped=8, lr=[1.486666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 02:30:42,649] [INFO] [timer.py:197:stop] 0/4340, RunningAvgSamplesPerSec=12.006432569253533, CurrSamplesPerSec=12.033545128328832, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:30:49,090] [INFO] [timer.py:197:stop] 0/4341, RunningAvgSamplesPerSec=12.00643638055112, CurrSamplesPerSec=12.022992593399461, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:30:55,580] [INFO] [timer.py:197:stop] 0/4342, RunningAvgSamplesPerSec=12.006420869489135, CurrSamplesPerSec=11.939493621509778, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:31:02,061] [INFO] [timer.py:197:stop] 0/4343, RunningAvgSamplesPerSec=12.00638089135416, CurrSamplesPerSec=11.835347964635325, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:31:08,657] [INFO] [timer.py:197:stop] 0/4344, RunningAvgSamplesPerSec=12.006353710424298, CurrSamplesPerSec=11.889509841300438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:31:15,172] [INFO] [timer.py:197:stop] 0/4345, RunningAvgSamplesPerSec=12.006339875838819, CurrSamplesPerSec=11.946569217374764, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:31:21,640] [INFO] [timer.py:197:stop] 0/4346, RunningAvgSamplesPerSec=12.006345985081886, 
CurrSamplesPerSec=12.032937204343298, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:31:28,132] [INFO] [timer.py:197:stop] 0/4347, RunningAvgSamplesPerSec=12.006330780730988, CurrSamplesPerSec=11.940644508576156, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:31:34,565] [INFO] [timer.py:197:stop] 0/4348, RunningAvgSamplesPerSec=12.006334574771998, CurrSamplesPerSec=12.022842353924629, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:31:41,055] [INFO] [timer.py:197:stop] 0/4349, RunningAvgSamplesPerSec=12.0063041695869, CurrSamplesPerSec=11.875602068576967, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:31:47,804] [INFO] [logging.py:68:log_dist] [Rank 0] step=4350, skipped=8, lr=[1.4644444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 02:31:47,805] [INFO] [timer.py:197:stop] 0/4350, RunningAvgSamplesPerSec=12.00627729549142, CurrSamplesPerSec=11.890581586588523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.4644444444444445e-06, 'epoch': 114.47} [2022-12-20 02:31:54,806] [INFO] [timer.py:197:stop] 0/4351, RunningAvgSamplesPerSec=12.006249710424147, CurrSamplesPerSec=11.887496430679365, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:32:01,814] [INFO] [timer.py:197:stop] 0/4352, RunningAvgSamplesPerSec=12.00623609593982, CurrSamplesPerSec=11.947317331442903, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:32:08,670] [INFO] [timer.py:197:stop] 0/4353, RunningAvgSamplesPerSec=12.006203190641648, CurrSamplesPerSec=11.864751914463433, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:32:15,366] [INFO] [timer.py:197:stop] 0/4354, RunningAvgSamplesPerSec=12.006195976324598, CurrSamplesPerSec=11.974888353599404, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:32:22,221] [INFO] [timer.py:197:stop] 0/4355, RunningAvgSamplesPerSec=12.00617514740299, CurrSamplesPerSec=11.916207101716209, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
02:32:28,806] [INFO] [timer.py:197:stop] 0/4356, RunningAvgSamplesPerSec=12.006161079253998, CurrSamplesPerSec=11.945233266622708, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:32:35,412] [INFO] [timer.py:197:stop] 0/4357, RunningAvgSamplesPerSec=12.006134085426195, CurrSamplesPerSec=11.889742606898984, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:32:42,119] [INFO] [timer.py:197:stop] 0/4358, RunningAvgSamplesPerSec=12.006105177934597, CurrSamplesPerSec=11.88151971476434, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:32:48,642] [INFO] [timer.py:197:stop] 0/4359, RunningAvgSamplesPerSec=12.006094356259334, CurrSamplesPerSec=11.959139538620846, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:32:55,075] [INFO] [logging.py:68:log_dist] [Rank 0] step=4360, skipped=8, lr=[1.4422222222222223e-06], mom=[[0.9, 0.999]] [2022-12-20 02:32:55,076] [INFO] [timer.py:197:stop] 0/4360, RunningAvgSamplesPerSec=12.006077298190796, CurrSamplesPerSec=11.932212647683857, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:33:01,578] [INFO] [timer.py:197:stop] 0/4361, RunningAvgSamplesPerSec=12.00607432415816, CurrSamplesPerSec=11.993127469569895, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:33:08,149] [INFO] [timer.py:197:stop] 0/4362, RunningAvgSamplesPerSec=12.006057199619379, CurrSamplesPerSec=11.931872672086417, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:33:14,787] [INFO] [timer.py:197:stop] 0/4363, RunningAvgSamplesPerSec=12.006038971586722, CurrSamplesPerSec=11.927087490026713, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:33:21,369] [INFO] [timer.py:197:stop] 0/4364, RunningAvgSamplesPerSec=12.006035580247664, CurrSamplesPerSec=11.991264150972054, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:33:27,813] [INFO] [timer.py:197:stop] 0/4365, RunningAvgSamplesPerSec=12.006033948977455, CurrSamplesPerSec=11.998922563979468, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 02:33:34,238] [INFO] [timer.py:197:stop] 0/4366, RunningAvgSamplesPerSec=12.006040674032398, CurrSamplesPerSec=12.035453988067015, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:33:40,666] [INFO] [timer.py:197:stop] 0/4367, RunningAvgSamplesPerSec=12.006014470642421, CurrSamplesPerSec=11.892741990649393, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:33:47,287] [INFO] [timer.py:197:stop] 0/4368, RunningAvgSamplesPerSec=12.00594274347164, CurrSamplesPerSec=11.700812602150643, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:33:53,835] [INFO] [timer.py:197:stop] 0/4369, RunningAvgSamplesPerSec=12.00594133860369, CurrSamplesPerSec=11.999810817840022, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:33:58,559] [INFO] [logging.py:68:log_dist] [Rank 0] step=4370, skipped=8, lr=[1.42e-06], mom=[[0.9, 0.999]] [2022-12-20 02:33:58,560] [INFO] [timer.py:197:stop] 0/4370, RunningAvgSamplesPerSec=12.006707877637767, CurrSamplesPerSec=16.648658183133254, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:34:04,984] [INFO] [timer.py:197:stop] 0/4371, RunningAvgSamplesPerSec=12.00668984601082, CurrSamplesPerSec=11.92844111734467, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:34:11,466] [INFO] [timer.py:197:stop] 0/4372, RunningAvgSamplesPerSec=12.006692942718098, CurrSamplesPerSec=12.020237722987043, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:34:18,041] [INFO] [timer.py:197:stop] 0/4373, RunningAvgSamplesPerSec=12.006678954219849, CurrSamplesPerSec=11.94585894133351, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:34:24,719] [INFO] [timer.py:197:stop] 0/4374, RunningAvgSamplesPerSec=12.006659711100784, CurrSamplesPerSec=11.923133308918716, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:34:31,279] [INFO] [timer.py:197:stop] 0/4375, RunningAvgSamplesPerSec=12.006622199238135, CurrSamplesPerSec=11.844830795951774, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.4088888888888892e-06, 'epoch': 115.13} [2022-12-20 02:34:37,790] [INFO] [timer.py:197:stop] 0/4376, RunningAvgSamplesPerSec=12.00659814171682, CurrSamplesPerSec=11.902308609863617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:34:44,254] [INFO] [timer.py:197:stop] 0/4377, RunningAvgSamplesPerSec=12.00657810601309, CurrSamplesPerSec=11.919577104504366, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:34:50,789] [INFO] [timer.py:197:stop] 0/4378, RunningAvgSamplesPerSec=12.006546742307107, CurrSamplesPerSec=11.870881328358363, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:34:57,345] [INFO] [timer.py:197:stop] 0/4379, RunningAvgSamplesPerSec=12.006525163465083, CurrSamplesPerSec=11.91283318853283, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:35:03,879] [INFO] [logging.py:68:log_dist] [Rank 0] step=4380, skipped=8, lr=[1.397777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 02:35:03,880] [INFO] [timer.py:197:stop] 0/4380, RunningAvgSamplesPerSec=12.006506316843401, CurrSamplesPerSec=11.924577679605997, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:35:10,482] [INFO] [timer.py:197:stop] 0/4381, RunningAvgSamplesPerSec=12.00649427671514, CurrSamplesPerSec=11.954013054249044, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:35:17,097] [INFO] [timer.py:197:stop] 0/4382, RunningAvgSamplesPerSec=12.006498533990518, CurrSamplesPerSec=12.025170141074154, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:35:23,626] [INFO] [timer.py:197:stop] 0/4383, RunningAvgSamplesPerSec=12.006500395949697, CurrSamplesPerSec=12.014661321713891, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:35:30,239] [INFO] [timer.py:197:stop] 0/4384, RunningAvgSamplesPerSec=12.006481188061917, CurrSamplesPerSec=11.922917239544763, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:35:36,730] [INFO] [timer.py:197:stop] 0/4385, 
RunningAvgSamplesPerSec=12.006465072667945, CurrSamplesPerSec=11.936260428674812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:35:43,157] [INFO] [timer.py:197:stop] 0/4386, RunningAvgSamplesPerSec=12.00644437477929, CurrSamplesPerSec=11.91640600012119, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:35:49,623] [INFO] [timer.py:197:stop] 0/4387, RunningAvgSamplesPerSec=12.006438304162378, CurrSamplesPerSec=11.979883594475771, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:35:56,171] [INFO] [timer.py:197:stop] 0/4388, RunningAvgSamplesPerSec=12.006439711342118, CurrSamplesPerSec=12.01261336804896, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:36:02,795] [INFO] [timer.py:197:stop] 0/4389, RunningAvgSamplesPerSec=12.00644079385828, CurrSamplesPerSec=12.01119058847955, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:36:09,408] [INFO] [logging.py:68:log_dist] [Rank 0] step=4390, skipped=8, lr=[1.3755555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 02:36:09,409] [INFO] [timer.py:197:stop] 0/4390, RunningAvgSamplesPerSec=12.006433106634903, CurrSamplesPerSec=11.972803737875903, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:36:15,948] [INFO] [timer.py:197:stop] 0/4391, RunningAvgSamplesPerSec=12.006372129326905, CurrSamplesPerSec=11.744637918067994, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:36:22,464] [INFO] [timer.py:197:stop] 0/4392, RunningAvgSamplesPerSec=12.006338855384488, CurrSamplesPerSec=11.862054922367713, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:36:28,977] [INFO] [timer.py:197:stop] 0/4393, RunningAvgSamplesPerSec=12.006325325628929, CurrSamplesPerSec=11.947222150393078, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:36:35,596] [INFO] [timer.py:197:stop] 0/4394, RunningAvgSamplesPerSec=12.006293339812856, CurrSamplesPerSec=11.867467971573829, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:36:42,171] [INFO] 
[timer.py:197:stop] 0/4395, RunningAvgSamplesPerSec=12.006289712905412, CurrSamplesPerSec=11.990381446558326, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:36:48,721] [INFO] [timer.py:197:stop] 0/4396, RunningAvgSamplesPerSec=12.006294444704725, CurrSamplesPerSec=12.027117298413899, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:36:55,219] [INFO] [timer.py:197:stop] 0/4397, RunningAvgSamplesPerSec=12.006273490168342, CurrSamplesPerSec=11.914900144664356, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:37:01,692] [INFO] [timer.py:197:stop] 0/4398, RunningAvgSamplesPerSec=12.006254926073582, CurrSamplesPerSec=11.925216555807307, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:37:08,184] [INFO] [timer.py:197:stop] 0/4399, RunningAvgSamplesPerSec=12.006226761408383, CurrSamplesPerSec=11.88367893124057, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:37:14,747] [INFO] [logging.py:68:log_dist] [Rank 0] step=4400, skipped=8, lr=[1.3533333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 02:37:14,747] [INFO] [timer.py:197:stop] 0/4400, RunningAvgSamplesPerSec=12.006207734228976, CurrSamplesPerSec=11.92312430587178, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.3533333333333334e-06, 'epoch': 115.79} [2022-12-20 02:37:21,324] [INFO] [timer.py:197:stop] 0/4401, RunningAvgSamplesPerSec=12.006189108588615, CurrSamplesPerSec=11.924828772037252, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:37:27,942] [INFO] [timer.py:197:stop] 0/4402, RunningAvgSamplesPerSec=12.006169811687124, CurrSamplesPerSec=11.921878838901975, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:37:34,507] [INFO] [timer.py:197:stop] 0/4403, RunningAvgSamplesPerSec=12.006171796389467, CurrSamplesPerSec=12.014910844486538, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:37:41,136] [INFO] [timer.py:197:stop] 0/4404, RunningAvgSamplesPerSec=12.006167193174607, 
CurrSamplesPerSec=11.98594257857603, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:37:47,641] [INFO] [timer.py:197:stop] 0/4405, RunningAvgSamplesPerSec=12.006163533102987, CurrSamplesPerSec=11.990073494713116, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:37:54,089] [INFO] [timer.py:197:stop] 0/4406, RunningAvgSamplesPerSec=12.006144307659065, CurrSamplesPerSec=11.92208745765753, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:38:00,555] [INFO] [timer.py:197:stop] 0/4407, RunningAvgSamplesPerSec=12.006119932982239, CurrSamplesPerSec=11.899725340611244, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:38:05,280] [INFO] [timer.py:197:stop] 0/4408, RunningAvgSamplesPerSec=12.006881235191226, CurrSamplesPerSec=16.660463635955136, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:38:11,772] [INFO] [timer.py:197:stop] 0/4409, RunningAvgSamplesPerSec=12.006855538043743, CurrSamplesPerSec=11.894691821946335, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:38:18,311] [INFO] [logging.py:68:log_dist] [Rank 0] step=4410, skipped=8, lr=[1.3311111111111113e-06], mom=[[0.9, 0.999]] [2022-12-20 02:38:18,312] [INFO] [timer.py:197:stop] 0/4410, RunningAvgSamplesPerSec=12.00685380077625, CurrSamplesPerSec=11.999202542851252, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:38:24,857] [INFO] [timer.py:197:stop] 0/4411, RunningAvgSamplesPerSec=12.006852990409007, CurrSamplesPerSec=12.003281954237496, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:38:31,481] [INFO] [timer.py:197:stop] 0/4412, RunningAvgSamplesPerSec=12.006826968610214, CurrSamplesPerSec=11.893183018157615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:38:37,949] [INFO] [timer.py:197:stop] 0/4413, RunningAvgSamplesPerSec=12.006830533316373, CurrSamplesPerSec=12.02257150156237, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:38:44,406] [INFO] [timer.py:197:stop] 0/4414, 
RunningAvgSamplesPerSec=12.006811367132377, CurrSamplesPerSec=11.922860575588459, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:38:50,926] [INFO] [timer.py:197:stop] 0/4415, RunningAvgSamplesPerSec=12.0067860243228, CurrSamplesPerSec=11.896005420033063, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:38:57,378] [INFO] [timer.py:197:stop] 0/4416, RunningAvgSamplesPerSec=12.006782114528892, CurrSamplesPerSec=11.989552958172126, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:39:03,967] [INFO] [timer.py:197:stop] 0/4417, RunningAvgSamplesPerSec=12.006782132251717, CurrSamplesPerSec=12.006860361303296, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:39:10,606] [INFO] [timer.py:197:stop] 0/4418, RunningAvgSamplesPerSec=12.006754711188213, CurrSamplesPerSec=11.886899488638573, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:39:17,103] [INFO] [timer.py:197:stop] 0/4419, RunningAvgSamplesPerSec=12.006753251026979, CurrSamplesPerSec=12.000308640788704, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:39:23,532] [INFO] [logging.py:68:log_dist] [Rank 0] step=4420, skipped=8, lr=[1.308888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 02:39:23,533] [INFO] [timer.py:197:stop] 0/4420, RunningAvgSamplesPerSec=12.00673686269824, CurrSamplesPerSec=11.934783511437784, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:39:30,013] [INFO] [timer.py:197:stop] 0/4421, RunningAvgSamplesPerSec=12.00672582127184, CurrSamplesPerSec=11.958142229906972, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:39:36,508] [INFO] [timer.py:197:stop] 0/4422, RunningAvgSamplesPerSec=12.006701927329173, CurrSamplesPerSec=11.902035245234671, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:39:43,083] [INFO] [timer.py:197:stop] 0/4423, RunningAvgSamplesPerSec=12.006701331738133, CurrSamplesPerSec=12.004069396537114, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:39:49,725] [INFO] 
[timer.py:197:stop] 0/4424, RunningAvgSamplesPerSec=12.006671066853565, CurrSamplesPerSec=11.874344973520257, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:39:56,225] [INFO] [timer.py:197:stop] 0/4425, RunningAvgSamplesPerSec=12.006670668047965, CurrSamplesPerSec=12.004907408728355, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.2977777777777779e-06, 'epoch': 116.45} [2022-12-20 02:40:02,818] [INFO] [timer.py:197:stop] 0/4426, RunningAvgSamplesPerSec=12.006645100537673, CurrSamplesPerSec=11.894615397601886, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:40:09,249] [INFO] [timer.py:197:stop] 0/4427, RunningAvgSamplesPerSec=12.006623945094086, CurrSamplesPerSec=11.91375632973401, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:40:15,778] [INFO] [timer.py:197:stop] 0/4428, RunningAvgSamplesPerSec=12.006568296211082, CurrSamplesPerSec=11.765271926842647, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:40:22,321] [INFO] [timer.py:197:stop] 0/4429, RunningAvgSamplesPerSec=12.00654515654871, CurrSamplesPerSec=11.904995428446945, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:40:28,868] [INFO] [logging.py:68:log_dist] [Rank 0] step=4430, skipped=8, lr=[1.286666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 02:40:28,868] [INFO] [timer.py:197:stop] 0/4430, RunningAvgSamplesPerSec=12.006522298243839, CurrSamplesPerSec=11.9061745275849, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:40:35,296] [INFO] [timer.py:197:stop] 0/4431, RunningAvgSamplesPerSec=12.006522457749995, CurrSamplesPerSec=12.007228792573176, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:40:41,745] [INFO] [timer.py:197:stop] 0/4432, RunningAvgSamplesPerSec=12.006501727549098, CurrSamplesPerSec=11.915384600360895, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:40:48,201] [INFO] [timer.py:197:stop] 0/4433, RunningAvgSamplesPerSec=12.006478531945925, 
CurrSamplesPerSec=11.904594176140597, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:40:54,652] [INFO] [timer.py:197:stop] 0/4434, RunningAvgSamplesPerSec=12.006477223077004, CurrSamplesPerSec=12.000680425599507, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:41:01,151] [INFO] [timer.py:197:stop] 0/4435, RunningAvgSamplesPerSec=12.006455899323647, CurrSamplesPerSec=11.91268727552665, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:41:07,678] [INFO] [timer.py:197:stop] 0/4436, RunningAvgSamplesPerSec=12.006435952030264, CurrSamplesPerSec=11.918656236230309, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:41:14,172] [INFO] [timer.py:197:stop] 0/4437, RunningAvgSamplesPerSec=12.006412269463, CurrSamplesPerSec=11.902314415040161, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:41:20,655] [INFO] [timer.py:197:stop] 0/4438, RunningAvgSamplesPerSec=12.006388767995103, CurrSamplesPerSec=11.903056996583157, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:41:27,145] [INFO] [timer.py:197:stop] 0/4439, RunningAvgSamplesPerSec=12.006384486311447, CurrSamplesPerSec=11.987420943822192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:41:33,574] [INFO] [logging.py:68:log_dist] [Rank 0] step=4440, skipped=8, lr=[1.2644444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 02:41:33,575] [INFO] [timer.py:197:stop] 0/4440, RunningAvgSamplesPerSec=12.006385508682811, CurrSamplesPerSec=12.010923485354805, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:41:40,088] [INFO] [timer.py:197:stop] 0/4441, RunningAvgSamplesPerSec=12.00636200660861, CurrSamplesPerSec=11.902958297037578, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:41:46,556] [INFO] [timer.py:197:stop] 0/4442, RunningAvgSamplesPerSec=12.006341780806963, CurrSamplesPerSec=11.917225997164483, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:41:53,039] [INFO] [timer.py:197:stop] 0/4443, 
RunningAvgSamplesPerSec=12.006319032232026, CurrSamplesPerSec=11.906158156943066, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:41:59,509] [INFO] [timer.py:197:stop] 0/4444, RunningAvgSamplesPerSec=12.006296423099014, CurrSamplesPerSec=11.906722177900726, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:42:05,982] [INFO] [timer.py:197:stop] 0/4445, RunningAvgSamplesPerSec=12.006283326878373, CurrSamplesPerSec=11.948390483073183, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:42:10,612] [INFO] [timer.py:197:stop] 0/4446, RunningAvgSamplesPerSec=12.007027459504016, CurrSamplesPerSec=16.569894515441206, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:42:17,044] [INFO] [timer.py:197:stop] 0/4447, RunningAvgSamplesPerSec=12.00702990149686, CurrSamplesPerSec=12.01789193724767, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:42:23,619] [INFO] [timer.py:197:stop] 0/4448, RunningAvgSamplesPerSec=12.006996915035675, CurrSamplesPerSec=11.862141412605926, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:42:30,130] [INFO] [timer.py:197:stop] 0/4449, RunningAvgSamplesPerSec=12.006970083666486, CurrSamplesPerSec=11.888851616878416, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:42:36,570] [INFO] [logging.py:68:log_dist] [Rank 0] step=4450, skipped=8, lr=[1.2422222222222224e-06], mom=[[0.9, 0.999]] [2022-12-20 02:42:36,571] [INFO] [timer.py:197:stop] 0/4450, RunningAvgSamplesPerSec=12.006962118698233, CurrSamplesPerSec=11.971646109581751, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.2422222222222224e-06, 'epoch': 117.11} [2022-12-20 02:42:43,090] [INFO] [timer.py:197:stop] 0/4451, RunningAvgSamplesPerSec=12.00693558923738, CurrSamplesPerSec=11.890081239190325, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:42:49,616] [INFO] [timer.py:197:stop] 0/4452, RunningAvgSamplesPerSec=12.00691102104234, CurrSamplesPerSec=11.898593403234393, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 02:42:56,099] [INFO] [timer.py:197:stop] 0/4453, RunningAvgSamplesPerSec=12.00687082562642, CurrSamplesPerSec=11.830627361323057, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:43:02,571] [INFO] [timer.py:197:stop] 0/4454, RunningAvgSamplesPerSec=12.006852809118085, CurrSamplesPerSec=11.927193479512797, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:43:08,983] [INFO] [timer.py:197:stop] 0/4455, RunningAvgSamplesPerSec=12.006856907892507, CurrSamplesPerSec=12.025132432573967, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:43:15,404] [INFO] [timer.py:197:stop] 0/4456, RunningAvgSamplesPerSec=12.006855371055616, CurrSamplesPerSec=12.000015735646855, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:43:21,850] [INFO] [timer.py:197:stop] 0/4457, RunningAvgSamplesPerSec=12.006856410843637, CurrSamplesPerSec=12.011489414102883, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:43:28,333] [INFO] [timer.py:197:stop] 0/4458, RunningAvgSamplesPerSec=12.006837768141057, CurrSamplesPerSec=11.924355201477216, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:43:34,855] [INFO] [timer.py:197:stop] 0/4459, RunningAvgSamplesPerSec=12.00683605312155, CurrSamplesPerSec=11.999198788258878, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:43:41,298] [INFO] [logging.py:68:log_dist] [Rank 0] step=4460, skipped=8, lr=[1.2200000000000002e-06], mom=[[0.9, 0.999]] [2022-12-20 02:43:41,299] [INFO] [timer.py:197:stop] 0/4460, RunningAvgSamplesPerSec=12.006812328653439, CurrSamplesPerSec=11.901995666454537, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:43:47,781] [INFO] [timer.py:197:stop] 0/4461, RunningAvgSamplesPerSec=12.006811971354606, CurrSamplesPerSec=12.00521934449069, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:43:54,255] [INFO] [timer.py:197:stop] 0/4462, RunningAvgSamplesPerSec=12.006788758044078, CurrSamplesPerSec=11.904165499615338, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:44:00,619] [INFO] [timer.py:197:stop] 0/4463, RunningAvgSamplesPerSec=12.006788971771314, CurrSamplesPerSec=12.007742270947455, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:44:07,057] [INFO] [timer.py:197:stop] 0/4464, RunningAvgSamplesPerSec=12.006787020717848, CurrSamplesPerSec=11.998089677271222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:44:13,538] [INFO] [timer.py:197:stop] 0/4465, RunningAvgSamplesPerSec=12.006785054895978, CurrSamplesPerSec=11.99801996244365, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:44:19,991] [INFO] [timer.py:197:stop] 0/4466, RunningAvgSamplesPerSec=12.006784482383644, CurrSamplesPerSec=12.004229903586658, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:44:26,413] [INFO] [timer.py:197:stop] 0/4467, RunningAvgSamplesPerSec=12.006764809634259, CurrSamplesPerSec=11.919583455814115, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:44:32,899] [INFO] [timer.py:197:stop] 0/4468, RunningAvgSamplesPerSec=12.006749762352205, CurrSamplesPerSec=11.939937592264867, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:44:39,372] [INFO] [timer.py:197:stop] 0/4469, RunningAvgSamplesPerSec=12.006722709597812, CurrSamplesPerSec=11.887108990568798, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:44:45,885] [INFO] [logging.py:68:log_dist] [Rank 0] step=4470, skipped=8, lr=[1.1977777777777778e-06], mom=[[0.9, 0.999]] [2022-12-20 02:44:45,885] [INFO] [timer.py:197:stop] 0/4470, RunningAvgSamplesPerSec=12.00671433499111, CurrSamplesPerSec=11.969421187268608, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:44:52,305] [INFO] [timer.py:197:stop] 0/4471, RunningAvgSamplesPerSec=12.006691654509767, CurrSamplesPerSec=11.906203572382818, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:44:58,795] [INFO] [timer.py:197:stop] 0/4472, RunningAvgSamplesPerSec=12.006691851801465, 
CurrSamplesPerSec=12.00757361316529, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:45:05,279] [INFO] [timer.py:197:stop] 0/4473, RunningAvgSamplesPerSec=12.006653243975826, CurrSamplesPerSec=11.836522180075248, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:45:11,753] [INFO] [timer.py:197:stop] 0/4474, RunningAvgSamplesPerSec=12.006625996463773, CurrSamplesPerSec=11.886026292544217, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:45:18,306] [INFO] [timer.py:197:stop] 0/4475, RunningAvgSamplesPerSec=12.006586057278847, CurrSamplesPerSec=11.830596598474019, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.1866666666666668e-06, 'epoch': 117.76} [2022-12-20 02:45:24,822] [INFO] [timer.py:197:stop] 0/4476, RunningAvgSamplesPerSec=12.0065472758059, CurrSamplesPerSec=11.835548869422286, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:45:31,344] [INFO] [timer.py:197:stop] 0/4477, RunningAvgSamplesPerSec=12.00652920507115, CurrSamplesPerSec=11.926221626483555, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:45:37,825] [INFO] [timer.py:197:stop] 0/4478, RunningAvgSamplesPerSec=12.006503827615942, CurrSamplesPerSec=11.894004038198814, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:45:44,259] [INFO] [timer.py:197:stop] 0/4479, RunningAvgSamplesPerSec=12.006506280733726, CurrSamplesPerSec=12.017496488905676, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:45:50,723] [INFO] [logging.py:68:log_dist] [Rank 0] step=4480, skipped=8, lr=[1.1755555555555556e-06], mom=[[0.9, 0.999]] [2022-12-20 02:45:50,724] [INFO] [timer.py:197:stop] 0/4480, RunningAvgSamplesPerSec=12.006509422924955, CurrSamplesPerSec=12.020593518567528, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:45:57,153] [INFO] [timer.py:197:stop] 0/4481, RunningAvgSamplesPerSec=12.006509502353698, CurrSamplesPerSec=12.006865194803325, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
02:46:03,623] [INFO] [timer.py:197:stop] 0/4482, RunningAvgSamplesPerSec=12.006485076482551, CurrSamplesPerSec=11.898069704525824, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:46:10,066] [INFO] [timer.py:197:stop] 0/4483, RunningAvgSamplesPerSec=12.006469028865459, CurrSamplesPerSec=11.935003725639124, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:46:14,771] [INFO] [timer.py:197:stop] 0/4484, RunningAvgSamplesPerSec=12.007194218302322, CurrSamplesPerSec=16.462913262516587, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:46:21,208] [INFO] [timer.py:197:stop] 0/4485, RunningAvgSamplesPerSec=12.007199410811989, CurrSamplesPerSec=12.030517445150618, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:46:27,692] [INFO] [timer.py:197:stop] 0/4486, RunningAvgSamplesPerSec=12.00718328852615, CurrSamplesPerSec=11.93533963427729, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:46:34,724] [INFO] [timer.py:197:stop] 0/4487, RunningAvgSamplesPerSec=12.00715829670455, CurrSamplesPerSec=11.896131418403694, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:46:41,707] [INFO] [timer.py:197:stop] 0/4488, RunningAvgSamplesPerSec=12.007131002232816, CurrSamplesPerSec=11.885951032099165, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:46:48,582] [INFO] [timer.py:197:stop] 0/4489, RunningAvgSamplesPerSec=12.00709367060103, CurrSamplesPerSec=11.841928140636703, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:46:55,443] [INFO] [logging.py:68:log_dist] [Rank 0] step=4490, skipped=8, lr=[1.1533333333333334e-06], mom=[[0.9, 0.999]] [2022-12-20 02:46:55,444] [INFO] [timer.py:197:stop] 0/4490, RunningAvgSamplesPerSec=12.007055617242601, CurrSamplesPerSec=11.838704752153465, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:47:02,246] [INFO] [timer.py:197:stop] 0/4491, RunningAvgSamplesPerSec=12.007034802020932, CurrSamplesPerSec=11.91433746391995, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 02:47:08,728] [INFO] [timer.py:197:stop] 0/4492, RunningAvgSamplesPerSec=12.007032115015528, CurrSamplesPerSec=11.99498225544606, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:47:15,203] [INFO] [timer.py:197:stop] 0/4493, RunningAvgSamplesPerSec=12.007023188383, CurrSamplesPerSec=11.967075985576317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:47:21,745] [INFO] [timer.py:197:stop] 0/4494, RunningAvgSamplesPerSec=12.00699703471592, CurrSamplesPerSec=11.89067902755952, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:47:28,350] [INFO] [timer.py:197:stop] 0/4495, RunningAvgSamplesPerSec=12.006961125708228, CurrSamplesPerSec=11.84779657884007, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:47:35,000] [INFO] [timer.py:197:stop] 0/4496, RunningAvgSamplesPerSec=12.006942348726927, CurrSamplesPerSec=11.923166143675695, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:47:41,691] [INFO] [timer.py:197:stop] 0/4497, RunningAvgSamplesPerSec=12.006903784445774, CurrSamplesPerSec=11.836062381743195, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:47:48,433] [INFO] [timer.py:197:stop] 0/4498, RunningAvgSamplesPerSec=12.006880792973083, CurrSamplesPerSec=11.904416261076374, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:47:54,950] [INFO] [timer.py:197:stop] 0/4499, RunningAvgSamplesPerSec=12.006859517922253, CurrSamplesPerSec=11.911963050101694, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:48:01,428] [INFO] [logging.py:68:log_dist] [Rank 0] step=4500, skipped=8, lr=[1.131111111111111e-06], mom=[[0.9, 0.999]] [2022-12-20 02:48:01,429] [INFO] [timer.py:197:stop] 0/4500, RunningAvgSamplesPerSec=12.006852767312127, CurrSamplesPerSec=11.97657185127812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.131111111111111e-06, 'epoch': 118.42} [2022-12-20 02:48:07,847] [INFO] [timer.py:197:stop] 0/4501, RunningAvgSamplesPerSec=12.006841384621834, 
CurrSamplesPerSec=11.95585948803523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:48:14,258] [INFO] [timer.py:197:stop] 0/4502, RunningAvgSamplesPerSec=12.006835953023023, CurrSamplesPerSec=11.98244883458178, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:48:20,873] [INFO] [timer.py:197:stop] 0/4503, RunningAvgSamplesPerSec=12.006810365847198, CurrSamplesPerSec=11.892762012694865, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:48:27,557] [INFO] [timer.py:197:stop] 0/4504, RunningAvgSamplesPerSec=12.00678210037201, CurrSamplesPerSec=11.880893399530617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:48:34,113] [INFO] [timer.py:197:stop] 0/4505, RunningAvgSamplesPerSec=12.006777209301138, CurrSamplesPerSec=11.984797925671565, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:48:40,655] [INFO] [timer.py:197:stop] 0/4506, RunningAvgSamplesPerSec=12.00675859256329, CurrSamplesPerSec=11.923508801043445, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:48:47,113] [INFO] [timer.py:197:stop] 0/4507, RunningAvgSamplesPerSec=12.006735377368361, CurrSamplesPerSec=11.90307705337541, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:48:53,657] [INFO] [timer.py:197:stop] 0/4508, RunningAvgSamplesPerSec=12.006704324082513, CurrSamplesPerSec=11.868420826910981, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:49:00,182] [INFO] [timer.py:197:stop] 0/4509, RunningAvgSamplesPerSec=12.006709232471888, CurrSamplesPerSec=12.028867260699505, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:49:06,762] [INFO] [logging.py:68:log_dist] [Rank 0] step=4510, skipped=8, lr=[1.1088888888888889e-06], mom=[[0.9, 0.999]] [2022-12-20 02:49:06,763] [INFO] [timer.py:197:stop] 0/4510, RunningAvgSamplesPerSec=12.006692157759657, CurrSamplesPerSec=11.93022663681067, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:49:13,255] [INFO] [timer.py:197:stop] 0/4511, 
RunningAvgSamplesPerSec=12.0066651311039, CurrSamplesPerSec=11.886053133912531, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:49:19,850] [INFO] [timer.py:197:stop] 0/4512, RunningAvgSamplesPerSec=12.006619698172113, CurrSamplesPerSec=11.805199996288273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:49:26,276] [INFO] [timer.py:197:stop] 0/4513, RunningAvgSamplesPerSec=12.006612748656805, CurrSamplesPerSec=11.975352056283144, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:49:32,789] [INFO] [timer.py:197:stop] 0/4514, RunningAvgSamplesPerSec=12.006618374423198, CurrSamplesPerSec=12.032049972131375, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:49:39,367] [INFO] [timer.py:197:stop] 0/4515, RunningAvgSamplesPerSec=12.006604615356057, CurrSamplesPerSec=11.9448431169324, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:49:45,866] [INFO] [timer.py:197:stop] 0/4516, RunningAvgSamplesPerSec=12.00658222614206, CurrSamplesPerSec=11.906383125188396, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:49:52,331] [INFO] [timer.py:197:stop] 0/4517, RunningAvgSamplesPerSec=12.006568892089119, CurrSamplesPerSec=11.946679275186918, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:49:58,781] [INFO] [timer.py:197:stop] 0/4518, RunningAvgSamplesPerSec=12.006563225782589, CurrSamplesPerSec=11.981034260517905, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:50:05,252] [INFO] [timer.py:197:stop] 0/4519, RunningAvgSamplesPerSec=12.00654140907714, CurrSamplesPerSec=11.9088192412582, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:50:11,667] [INFO] [logging.py:68:log_dist] [Rank 0] step=4520, skipped=8, lr=[1.0866666666666667e-06], mom=[[0.9, 0.999]] [2022-12-20 02:50:11,668] [INFO] [timer.py:197:stop] 0/4520, RunningAvgSamplesPerSec=12.00654406828836, CurrSamplesPerSec=12.018567756830047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:50:18,280] [INFO] [timer.py:197:stop] 
0/4521, RunningAvgSamplesPerSec=12.006525138792718, CurrSamplesPerSec=11.921606692414054, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:50:22,868] [INFO] [timer.py:197:stop] 0/4522, RunningAvgSamplesPerSec=12.00726219694435, CurrSamplesPerSec=16.617046367197382, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:50:29,392] [INFO] [timer.py:197:stop] 0/4523, RunningAvgSamplesPerSec=12.007242485192332, CurrSamplesPerSec=11.918801766289405, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:50:35,937] [INFO] [timer.py:197:stop] 0/4524, RunningAvgSamplesPerSec=12.007242851903326, CurrSamplesPerSec=12.008900981302723, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:50:42,570] [INFO] [timer.py:197:stop] 0/4525, RunningAvgSamplesPerSec=12.0072279378393, CurrSamplesPerSec=11.940163307760562, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.0755555555555557e-06, 'epoch': 119.08} [2022-12-20 02:50:49,140] [INFO] [timer.py:197:stop] 0/4526, RunningAvgSamplesPerSec=12.007211313247408, CurrSamplesPerSec=11.932486339463379, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:50:55,608] [INFO] [timer.py:197:stop] 0/4527, RunningAvgSamplesPerSec=12.007195183402365, CurrSamplesPerSec=11.934664652209609, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:51:02,132] [INFO] [timer.py:197:stop] 0/4528, RunningAvgSamplesPerSec=12.007176002578808, CurrSamplesPerSec=11.921005790383301, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:51:08,620] [INFO] [timer.py:197:stop] 0/4529, RunningAvgSamplesPerSec=12.007165609040534, CurrSamplesPerSec=11.960308071811896, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:51:15,222] [INFO] [logging.py:68:log_dist] [Rank 0] step=4530, skipped=8, lr=[1.0644444444444445e-06], mom=[[0.9, 0.999]] [2022-12-20 02:51:15,222] [INFO] [timer.py:197:stop] 0/4530, RunningAvgSamplesPerSec=12.007137334533539, CurrSamplesPerSec=11.880489037956645, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:51:21,763] [INFO] [timer.py:197:stop] 0/4531, RunningAvgSamplesPerSec=12.007128013287748, CurrSamplesPerSec=11.965069286831199, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:51:28,281] [INFO] [timer.py:197:stop] 0/4532, RunningAvgSamplesPerSec=12.007099513965292, CurrSamplesPerSec=11.879399134033612, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:51:34,691] [INFO] [timer.py:197:stop] 0/4533, RunningAvgSamplesPerSec=12.007098873729236, CurrSamplesPerSec=12.004199304921661, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:51:41,118] [INFO] [timer.py:197:stop] 0/4534, RunningAvgSamplesPerSec=12.007074594510936, CurrSamplesPerSec=11.898064430855868, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:51:47,695] [INFO] [timer.py:197:stop] 0/4535, RunningAvgSamplesPerSec=12.007044330634653, CurrSamplesPerSec=11.871437811832845, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:51:54,254] [INFO] [timer.py:197:stop] 0/4536, RunningAvgSamplesPerSec=12.007017884047125, CurrSamplesPerSec=11.888320877784274, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:52:00,825] [INFO] [timer.py:197:stop] 0/4537, RunningAvgSamplesPerSec=12.006998650797906, CurrSamplesPerSec=11.920424005522104, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:52:07,313] [INFO] [timer.py:197:stop] 0/4538, RunningAvgSamplesPerSec=12.006994134091766, CurrSamplesPerSec=11.986545763221972, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:52:13,721] [INFO] [timer.py:197:stop] 0/4539, RunningAvgSamplesPerSec=12.006996058030902, CurrSamplesPerSec=12.015729394962602, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:52:20,197] [INFO] [logging.py:68:log_dist] [Rank 0] step=4540, skipped=8, lr=[1.0422222222222221e-06], mom=[[0.9, 0.999]] [2022-12-20 02:52:20,198] [INFO] [timer.py:197:stop] 0/4540, RunningAvgSamplesPerSec=12.00697773543422, 
CurrSamplesPerSec=11.924419825221046, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:52:26,683] [INFO] [timer.py:197:stop] 0/4541, RunningAvgSamplesPerSec=12.006979627228047, CurrSamplesPerSec=12.015570731590282, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:52:33,220] [INFO] [timer.py:197:stop] 0/4542, RunningAvgSamplesPerSec=12.006942586695429, CurrSamplesPerSec=11.841137798920014, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:52:39,735] [INFO] [timer.py:197:stop] 0/4543, RunningAvgSamplesPerSec=12.006916977327094, CurrSamplesPerSec=11.891765734274253, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:52:46,458] [INFO] [timer.py:197:stop] 0/4544, RunningAvgSamplesPerSec=12.006893189048657, CurrSamplesPerSec=11.899834009489174, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:52:53,093] [INFO] [timer.py:197:stop] 0/4545, RunningAvgSamplesPerSec=12.006892894083292, CurrSamplesPerSec=12.005553310899092, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:52:59,523] [INFO] [timer.py:197:stop] 0/4546, RunningAvgSamplesPerSec=12.006891974373934, CurrSamplesPerSec=12.002715188552102, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:53:06,089] [INFO] [timer.py:197:stop] 0/4547, RunningAvgSamplesPerSec=12.006854748870806, CurrSamplesPerSec=11.8400524916115, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:53:12,756] [INFO] [timer.py:197:stop] 0/4548, RunningAvgSamplesPerSec=12.006831944810742, CurrSamplesPerSec=11.904074700175789, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:53:19,220] [INFO] [timer.py:197:stop] 0/4549, RunningAvgSamplesPerSec=12.006802303976286, CurrSamplesPerSec=11.873550824160635, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:53:25,686] [INFO] [logging.py:68:log_dist] [Rank 0] step=4550, skipped=8, lr=[1.02e-06], mom=[[0.9, 0.999]] [2022-12-20 02:53:25,687] [INFO] [timer.py:197:stop] 0/4550, 
RunningAvgSamplesPerSec=12.006808356669554, CurrSamplesPerSec=12.034393195888887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.02e-06, 'epoch': 119.74} [2022-12-20 02:53:32,295] [INFO] [timer.py:197:stop] 0/4551, RunningAvgSamplesPerSec=12.0068086401097, CurrSamplesPerSec=12.008097864343952, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:53:38,845] [INFO] [timer.py:197:stop] 0/4552, RunningAvgSamplesPerSec=12.006786858234491, CurrSamplesPerSec=11.908512294913061, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:53:45,411] [INFO] [timer.py:197:stop] 0/4553, RunningAvgSamplesPerSec=12.006759465961652, CurrSamplesPerSec=11.883405369545013, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:53:51,928] [INFO] [timer.py:197:stop] 0/4554, RunningAvgSamplesPerSec=12.006739783207395, CurrSamplesPerSec=11.917827047310144, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:53:58,394] [INFO] [timer.py:197:stop] 0/4555, RunningAvgSamplesPerSec=12.006728694541437, CurrSamplesPerSec=11.956464441588158, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:54:04,911] [INFO] [timer.py:197:stop] 0/4556, RunningAvgSamplesPerSec=12.006701123793741, CurrSamplesPerSec=11.882470620111771, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:54:11,461] [INFO] [timer.py:197:stop] 0/4557, RunningAvgSamplesPerSec=12.006664095249096, CurrSamplesPerSec=11.840372109091177, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:54:18,048] [INFO] [timer.py:197:stop] 0/4558, RunningAvgSamplesPerSec=12.006651106687757, CurrSamplesPerSec=11.947778369499446, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:54:24,726] [INFO] [timer.py:197:stop] 0/4559, RunningAvgSamplesPerSec=12.006631485925938, CurrSamplesPerSec=11.91790006633451, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:54:29,381] [INFO] [logging.py:68:log_dist] [Rank 0] step=4560, skipped=8, lr=[9.97777777777778e-07], 
mom=[[0.9, 0.999]] [2022-12-20 02:54:29,382] [INFO] [timer.py:197:stop] 0/4560, RunningAvgSamplesPerSec=12.007344117780661, CurrSamplesPerSec=16.459069389542993, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:54:36,010] [INFO] [timer.py:197:stop] 0/4561, RunningAvgSamplesPerSec=12.00734261089808, CurrSamplesPerSec=12.00047816751331, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:54:42,570] [INFO] [timer.py:197:stop] 0/4562, RunningAvgSamplesPerSec=12.007347865406071, CurrSamplesPerSec=12.03135106551913, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:54:49,058] [INFO] [timer.py:197:stop] 0/4563, RunningAvgSamplesPerSec=12.007333352953518, CurrSamplesPerSec=11.94151937364594, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:54:55,636] [INFO] [timer.py:197:stop] 0/4564, RunningAvgSamplesPerSec=12.007310108551806, CurrSamplesPerSec=11.902220477424395, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:55:02,069] [INFO] [timer.py:197:stop] 0/4565, RunningAvgSamplesPerSec=12.007293197727606, CurrSamplesPerSec=11.930638633309854, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:55:08,459] [INFO] [timer.py:197:stop] 0/4566, RunningAvgSamplesPerSec=12.007298510899473, CurrSamplesPerSec=12.031591575143702, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:55:14,898] [INFO] [timer.py:197:stop] 0/4567, RunningAvgSamplesPerSec=12.007278571041075, CurrSamplesPerSec=11.916957766357438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:55:21,406] [INFO] [timer.py:197:stop] 0/4568, RunningAvgSamplesPerSec=12.007253013704723, CurrSamplesPerSec=11.891706732009846, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:55:27,873] [INFO] [timer.py:197:stop] 0/4569, RunningAvgSamplesPerSec=12.007236537610227, CurrSamplesPerSec=11.932475200599354, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:55:34,425] [INFO] [logging.py:68:log_dist] [Rank 0] step=4570, skipped=8, 
lr=[9.755555555555556e-07], mom=[[0.9, 0.999]] [2022-12-20 02:55:34,426] [INFO] [timer.py:197:stop] 0/4570, RunningAvgSamplesPerSec=12.007217839885337, CurrSamplesPerSec=11.922428464440369, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:55:40,906] [INFO] [timer.py:197:stop] 0/4571, RunningAvgSamplesPerSec=12.007200300747012, CurrSamplesPerSec=11.927612686388766, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:55:47,386] [INFO] [timer.py:197:stop] 0/4572, RunningAvgSamplesPerSec=12.007185959757596, CurrSamplesPerSec=11.942017684397168, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:55:53,850] [INFO] [timer.py:197:stop] 0/4573, RunningAvgSamplesPerSec=12.00718532582527, CurrSamplesPerSec=12.004288954082908, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:56:00,278] [INFO] [timer.py:197:stop] 0/4574, RunningAvgSamplesPerSec=12.007192352205754, CurrSamplesPerSec=12.039396096826657, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:56:06,763] [INFO] [timer.py:197:stop] 0/4575, RunningAvgSamplesPerSec=12.00717161895591, CurrSamplesPerSec=11.913121853053477, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 9.644444444444444e-07, 'epoch': 120.39} [2022-12-20 02:56:13,281] [INFO] [timer.py:197:stop] 0/4576, RunningAvgSamplesPerSec=12.007174330845183, CurrSamplesPerSec=12.01958862525755, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:56:19,727] [INFO] [timer.py:197:stop] 0/4577, RunningAvgSamplesPerSec=12.00717601046699, CurrSamplesPerSec=12.014863520405651, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:56:26,198] [INFO] [timer.py:197:stop] 0/4578, RunningAvgSamplesPerSec=12.007166960600811, CurrSamplesPerSec=11.965906129612874, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:56:32,650] [INFO] [timer.py:197:stop] 0/4579, RunningAvgSamplesPerSec=12.00717238976553, CurrSamplesPerSec=12.032067769424673, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 02:56:39,134] [INFO] [logging.py:68:log_dist] [Rank 0] step=4580, skipped=8, lr=[9.533333333333335e-07], mom=[[0.9, 0.999]] [2022-12-20 02:56:39,135] [INFO] [timer.py:197:stop] 0/4580, RunningAvgSamplesPerSec=12.007160086725449, CurrSamplesPerSec=11.951111983334052, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:56:45,610] [INFO] [timer.py:197:stop] 0/4581, RunningAvgSamplesPerSec=12.007153797850167, CurrSamplesPerSec=11.97843220980157, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:56:52,131] [INFO] [timer.py:197:stop] 0/4582, RunningAvgSamplesPerSec=12.007132116753585, CurrSamplesPerSec=11.908668671796743, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:56:58,646] [INFO] [timer.py:197:stop] 0/4583, RunningAvgSamplesPerSec=12.00710126330106, CurrSamplesPerSec=11.86743649218033, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:57:05,084] [INFO] [timer.py:197:stop] 0/4584, RunningAvgSamplesPerSec=12.007107765597725, CurrSamplesPerSec=12.036968881729443, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:57:11,579] [INFO] [timer.py:197:stop] 0/4585, RunningAvgSamplesPerSec=12.00707870129704, CurrSamplesPerSec=11.875367229518936, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:57:18,119] [INFO] [timer.py:197:stop] 0/4586, RunningAvgSamplesPerSec=12.007049366050488, CurrSamplesPerSec=11.874094952079492, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:57:24,643] [INFO] [timer.py:197:stop] 0/4587, RunningAvgSamplesPerSec=12.007020970642447, CurrSamplesPerSec=11.87825266370254, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:57:31,106] [INFO] [timer.py:197:stop] 0/4588, RunningAvgSamplesPerSec=12.007017710158063, CurrSamplesPerSec=11.992086982790054, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:57:37,592] [INFO] [timer.py:197:stop] 0/4589, RunningAvgSamplesPerSec=12.007017647636884, CurrSamplesPerSec=12.006730932360982, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 02:57:44,132] [INFO] [logging.py:68:log_dist] [Rank 0] step=4590, skipped=8, lr=[9.311111111111113e-07], mom=[[0.9, 0.999]] [2022-12-20 02:57:44,133] [INFO] [timer.py:197:stop] 0/4590, RunningAvgSamplesPerSec=12.006973477953133, CurrSamplesPerSec=11.807729918458687, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:57:50,670] [INFO] [timer.py:197:stop] 0/4591, RunningAvgSamplesPerSec=12.006956645834682, CurrSamplesPerSec=11.930224515924191, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:57:57,130] [INFO] [timer.py:197:stop] 0/4592, RunningAvgSamplesPerSec=12.006937195693586, CurrSamplesPerSec=11.91833925760241, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:58:03,558] [INFO] [timer.py:197:stop] 0/4593, RunningAvgSamplesPerSec=12.006940132508062, CurrSamplesPerSec=12.020435264982451, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:58:10,034] [INFO] [timer.py:197:stop] 0/4594, RunningAvgSamplesPerSec=12.00693607395291, CurrSamplesPerSec=11.98833212386563, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:58:16,489] [INFO] [timer.py:197:stop] 0/4595, RunningAvgSamplesPerSec=12.00691250535897, CurrSamplesPerSec=11.899652544123528, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:58:23,053] [INFO] [timer.py:197:stop] 0/4596, RunningAvgSamplesPerSec=12.006886654860963, CurrSamplesPerSec=11.889318158827043, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:58:29,558] [INFO] [timer.py:197:stop] 0/4597, RunningAvgSamplesPerSec=12.006863980671698, CurrSamplesPerSec=11.903594858698735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:58:34,124] [INFO] [timer.py:197:stop] 0/4598, RunningAvgSamplesPerSec=12.007595206071795, CurrSamplesPerSec=16.67347157567222, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:58:40,657] [INFO] [timer.py:197:stop] 0/4599, RunningAvgSamplesPerSec=12.007567488234113, CurrSamplesPerSec=11.881513929854682, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:58:47,150] [INFO] [logging.py:68:log_dist] [Rank 0] step=4600, skipped=8, lr=[9.08888888888889e-07], mom=[[0.9, 0.999]] [2022-12-20 02:58:47,151] [INFO] [timer.py:197:stop] 0/4600, RunningAvgSamplesPerSec=12.007549342741417, CurrSamplesPerSec=11.924710111005098, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 9.08888888888889e-07, 'epoch': 121.05} [2022-12-20 02:58:53,581] [INFO] [timer.py:197:stop] 0/4601, RunningAvgSamplesPerSec=12.007546671427257, CurrSamplesPerSec=11.995276522973239, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:59:00,126] [INFO] [timer.py:197:stop] 0/4602, RunningAvgSamplesPerSec=12.007521057570864, CurrSamplesPerSec=11.890867593541614, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:59:06,679] [INFO] [timer.py:197:stop] 0/4603, RunningAvgSamplesPerSec=12.007486615786132, CurrSamplesPerSec=11.851118052096002, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:59:13,186] [INFO] [timer.py:197:stop] 0/4604, RunningAvgSamplesPerSec=12.007447233230527, CurrSamplesPerSec=11.828942418506202, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:59:19,727] [INFO] [timer.py:197:stop] 0/4605, RunningAvgSamplesPerSec=12.007418610490896, CurrSamplesPerSec=11.87712638407323, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:59:26,189] [INFO] [timer.py:197:stop] 0/4606, RunningAvgSamplesPerSec=12.007398778294322, CurrSamplesPerSec=11.916800113647462, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:59:32,682] [INFO] [timer.py:197:stop] 0/4607, RunningAvgSamplesPerSec=12.00738458670786, CurrSamplesPerSec=11.942400211268925, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:59:39,189] [INFO] [timer.py:197:stop] 0/4608, RunningAvgSamplesPerSec=12.007363688375023, CurrSamplesPerSec=11.911892218181954, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:59:45,738] [INFO] [timer.py:197:stop] 
0/4609, RunningAvgSamplesPerSec=12.007311109759868, CurrSamplesPerSec=11.76992296191065, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:59:52,312] [INFO] [logging.py:68:log_dist] [Rank 0] step=4610, skipped=8, lr=[8.866666666666668e-07], mom=[[0.9, 0.999]] [2022-12-20 02:59:52,313] [INFO] [timer.py:197:stop] 0/4610, RunningAvgSamplesPerSec=12.00731524389846, CurrSamplesPerSec=12.026391485650969, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 02:59:58,684] [INFO] [timer.py:197:stop] 0/4611, RunningAvgSamplesPerSec=12.007317865640653, CurrSamplesPerSec=12.019411023666471, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:00:05,153] [INFO] [timer.py:197:stop] 0/4612, RunningAvgSamplesPerSec=12.00730309203446, CurrSamplesPerSec=11.939595583164422, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:00:11,780] [INFO] [timer.py:197:stop] 0/4613, RunningAvgSamplesPerSec=12.007286523378589, CurrSamplesPerSec=11.931387935395357, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:00:18,312] [INFO] [timer.py:197:stop] 0/4614, RunningAvgSamplesPerSec=12.007268052435679, CurrSamplesPerSec=11.922698530237371, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:00:24,771] [INFO] [timer.py:197:stop] 0/4615, RunningAvgSamplesPerSec=12.007263553918241, CurrSamplesPerSec=11.986552186100312, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:00:31,450] [INFO] [timer.py:197:stop] 0/4616, RunningAvgSamplesPerSec=12.007222962013381, CurrSamplesPerSec=11.822848415692116, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:00:37,937] [INFO] [timer.py:197:stop] 0/4617, RunningAvgSamplesPerSec=12.007198554868866, CurrSamplesPerSec=11.895630603840212, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:00:44,391] [INFO] [timer.py:197:stop] 0/4618, RunningAvgSamplesPerSec=12.00718069078383, CurrSamplesPerSec=11.925300261078046, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:00:50,781] [INFO] 
[timer.py:197:stop] 0/4619, RunningAvgSamplesPerSec=12.007179364366882, CurrSamplesPerSec=12.001059744954834, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:00:57,233] [INFO] [logging.py:68:log_dist] [Rank 0] step=4620, skipped=8, lr=[8.644444444444445e-07], mom=[[0.9, 0.999]] [2022-12-20 03:00:57,234] [INFO] [timer.py:197:stop] 0/4620, RunningAvgSamplesPerSec=12.007166190477898, CurrSamplesPerSec=11.946648969210244, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:01:03,696] [INFO] [timer.py:197:stop] 0/4621, RunningAvgSamplesPerSec=12.00715926463347, CurrSamplesPerSec=11.975260701877074, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:01:10,180] [INFO] [timer.py:197:stop] 0/4622, RunningAvgSamplesPerSec=12.007134027810288, CurrSamplesPerSec=11.891686186715923, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:01:16,774] [INFO] [timer.py:197:stop] 0/4623, RunningAvgSamplesPerSec=12.007109419986266, CurrSamplesPerSec=11.89448785019356, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:01:23,241] [INFO] [timer.py:197:stop] 0/4624, RunningAvgSamplesPerSec=12.007110157307316, CurrSamplesPerSec=12.010518285187166, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:01:30,013] [INFO] [timer.py:197:stop] 0/4625, RunningAvgSamplesPerSec=12.007092424578941, CurrSamplesPerSec=11.925687545695585, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 8.533333333333334e-07, 'epoch': 121.71} [2022-12-20 03:01:36,492] [INFO] [timer.py:197:stop] 0/4626, RunningAvgSamplesPerSec=12.007093990381925, CurrSamplesPerSec=12.014337065148178, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:01:42,944] [INFO] [timer.py:197:stop] 0/4627, RunningAvgSamplesPerSec=12.0070751108741, CurrSamplesPerSec=11.920406536973136, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:01:49,508] [INFO] [timer.py:197:stop] 0/4628, RunningAvgSamplesPerSec=12.007048213105524, 
CurrSamplesPerSec=11.883921990851828, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:01:56,017] [INFO] [timer.py:197:stop] 0/4629, RunningAvgSamplesPerSec=12.00701708647456, CurrSamplesPerSec=11.864731986619553, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:02:02,456] [INFO] [logging.py:68:log_dist] [Rank 0] step=4630, skipped=8, lr=[8.422222222222224e-07], mom=[[0.9, 0.999]] [2022-12-20 03:02:02,457] [INFO] [timer.py:197:stop] 0/4630, RunningAvgSamplesPerSec=12.00701280909004, CurrSamplesPerSec=11.987253927015216, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:02:08,955] [INFO] [timer.py:197:stop] 0/4631, RunningAvgSamplesPerSec=12.00699516498391, CurrSamplesPerSec=11.925889939730231, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:02:15,520] [INFO] [timer.py:197:stop] 0/4632, RunningAvgSamplesPerSec=12.006995275529043, CurrSamplesPerSec=12.00750701076006, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:02:21,992] [INFO] [timer.py:197:stop] 0/4633, RunningAvgSamplesPerSec=12.00697000567237, CurrSamplesPerSec=11.891099885556637, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:02:28,547] [INFO] [timer.py:197:stop] 0/4634, RunningAvgSamplesPerSec=12.006960823493602, CurrSamplesPerSec=11.964588248555469, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:02:35,013] [INFO] [timer.py:197:stop] 0/4635, RunningAvgSamplesPerSec=12.006934028174843, CurrSamplesPerSec=11.884088245143264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:02:39,655] [INFO] [timer.py:197:stop] 0/4636, RunningAvgSamplesPerSec=12.007657629550478, CurrSamplesPerSec=16.65900267582468, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:02:46,130] [INFO] [timer.py:197:stop] 0/4637, RunningAvgSamplesPerSec=12.007644738243954, CurrSamplesPerSec=11.948202215573538, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:02:52,620] [INFO] [timer.py:197:stop] 0/4638, 
RunningAvgSamplesPerSec=12.007622573698736, CurrSamplesPerSec=11.90576157869771, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:02:59,129] [INFO] [timer.py:197:stop] 0/4639, RunningAvgSamplesPerSec=12.007605582554442, CurrSamplesPerSec=11.929348124126692, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:03:05,564] [INFO] [logging.py:68:log_dist] [Rank 0] step=4640, skipped=8, lr=[8.200000000000001e-07], mom=[[0.9, 0.999]] [2022-12-20 03:03:05,564] [INFO] [timer.py:197:stop] 0/4640, RunningAvgSamplesPerSec=12.007581930179681, CurrSamplesPerSec=11.898898783578089, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:03:12,122] [INFO] [timer.py:197:stop] 0/4641, RunningAvgSamplesPerSec=12.007565483903136, CurrSamplesPerSec=11.931769251215911, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:03:18,610] [INFO] [timer.py:197:stop] 0/4642, RunningAvgSamplesPerSec=12.007536971217027, CurrSamplesPerSec=11.876708091022678, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:03:25,103] [INFO] [timer.py:197:stop] 0/4643, RunningAvgSamplesPerSec=12.007536528564543, CurrSamplesPerSec=12.005482972377843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:03:31,761] [INFO] [timer.py:197:stop] 0/4644, RunningAvgSamplesPerSec=12.00751145831812, CurrSamplesPerSec=11.8922772873772, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:03:38,228] [INFO] [timer.py:197:stop] 0/4645, RunningAvgSamplesPerSec=12.007493065267901, CurrSamplesPerSec=11.92271547595021, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:03:44,734] [INFO] [timer.py:197:stop] 0/4646, RunningAvgSamplesPerSec=12.007471034924471, CurrSamplesPerSec=11.906048316381579, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:03:51,345] [INFO] [timer.py:197:stop] 0/4647, RunningAvgSamplesPerSec=12.007474263595746, CurrSamplesPerSec=12.022486963661995, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:03:57,961] [INFO] 
[timer.py:197:stop] 0/4648, RunningAvgSamplesPerSec=12.00744097602738, CurrSamplesPerSec=11.854786389199438, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:04:04,493] [INFO] [timer.py:197:stop] 0/4649, RunningAvgSamplesPerSec=12.007416511163415, CurrSamplesPerSec=11.894818846650086, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:04:11,032] [INFO] [logging.py:68:log_dist] [Rank 0] step=4650, skipped=8, lr=[7.977777777777779e-07], mom=[[0.9, 0.999]] [2022-12-20 03:04:11,033] [INFO] [timer.py:197:stop] 0/4650, RunningAvgSamplesPerSec=12.007391078284957, CurrSamplesPerSec=11.890356688390131, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 7.977777777777779e-07, 'epoch': 122.37} [2022-12-20 03:04:17,545] [INFO] [timer.py:197:stop] 0/4651, RunningAvgSamplesPerSec=12.007367891043627, CurrSamplesPerSec=11.900552540073276, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:04:24,049] [INFO] [timer.py:197:stop] 0/4652, RunningAvgSamplesPerSec=12.007339043029537, CurrSamplesPerSec=11.874706367655874, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:04:30,530] [INFO] [timer.py:197:stop] 0/4653, RunningAvgSamplesPerSec=12.007322188694227, CurrSamplesPerSec=11.929457864867317, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:04:37,049] [INFO] [timer.py:197:stop] 0/4654, RunningAvgSamplesPerSec=12.007295219518488, CurrSamplesPerSec=11.88315864874105, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:04:43,560] [INFO] [timer.py:197:stop] 0/4655, RunningAvgSamplesPerSec=12.007271495458196, CurrSamplesPerSec=11.89791255116723, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:04:50,022] [INFO] [timer.py:197:stop] 0/4656, RunningAvgSamplesPerSec=12.007264949203428, CurrSamplesPerSec=11.976882316215837, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:04:56,477] [INFO] [timer.py:197:stop] 0/4657, RunningAvgSamplesPerSec=12.007262426048356, 
CurrSamplesPerSec=11.99553113767545, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:05:02,958] [INFO] [timer.py:197:stop] 0/4658, RunningAvgSamplesPerSec=12.007238018248463, CurrSamplesPerSec=11.89468497006848, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:05:09,412] [INFO] [timer.py:197:stop] 0/4659, RunningAvgSamplesPerSec=12.007237024665125, CurrSamplesPerSec=12.002612682671375, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:05:16,035] [INFO] [logging.py:68:log_dist] [Rank 0] step=4660, skipped=8, lr=[7.755555555555556e-07], mom=[[0.9, 0.999]] [2022-12-20 03:05:16,036] [INFO] [timer.py:197:stop] 0/4660, RunningAvgSamplesPerSec=12.007240261921488, CurrSamplesPerSec=12.022335121501582, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:05:22,541] [INFO] [timer.py:197:stop] 0/4661, RunningAvgSamplesPerSec=12.007242626042464, CurrSamplesPerSec=12.018264812379327, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:05:29,010] [INFO] [timer.py:197:stop] 0/4662, RunningAvgSamplesPerSec=12.007221787839358, CurrSamplesPerSec=11.910915457913543, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:05:35,445] [INFO] [timer.py:197:stop] 0/4663, RunningAvgSamplesPerSec=12.007219318693304, CurrSamplesPerSec=11.995724116018723, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:05:42,181] [INFO] [timer.py:197:stop] 0/4664, RunningAvgSamplesPerSec=12.007207258734885, CurrSamplesPerSec=11.951257774971513, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:05:48,690] [INFO] [timer.py:197:stop] 0/4665, RunningAvgSamplesPerSec=12.007184618576222, CurrSamplesPerSec=11.90255612650817, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:05:55,200] [INFO] [timer.py:197:stop] 0/4666, RunningAvgSamplesPerSec=12.00717142654244, CurrSamplesPerSec=11.945970580925152, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:06:01,585] [INFO] [timer.py:197:stop] 0/4667, 
RunningAvgSamplesPerSec=12.00717364255029, CurrSamplesPerSec=12.01751800923427, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:06:08,003] [INFO] [timer.py:197:stop] 0/4668, RunningAvgSamplesPerSec=12.00717492214451, CurrSamplesPerSec=12.013147198898732, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:06:14,643] [INFO] [timer.py:197:stop] 0/4669, RunningAvgSamplesPerSec=12.007181216082797, CurrSamplesPerSec=12.036620751648663, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:06:21,105] [INFO] [logging.py:68:log_dist] [Rank 0] step=4670, skipped=8, lr=[7.533333333333335e-07], mom=[[0.9, 0.999]] [2022-12-20 03:06:21,106] [INFO] [timer.py:197:stop] 0/4670, RunningAvgSamplesPerSec=12.00716548529169, CurrSamplesPerSec=11.934196136436437, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:06:27,576] [INFO] [timer.py:197:stop] 0/4671, RunningAvgSamplesPerSec=12.007168626561281, CurrSamplesPerSec=12.021850006117615, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:06:34,011] [INFO] [timer.py:197:stop] 0/4672, RunningAvgSamplesPerSec=12.007162645165302, CurrSamplesPerSec=11.979300325437812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:06:40,502] [INFO] [timer.py:197:stop] 0/4673, RunningAvgSamplesPerSec=12.007151155931998, CurrSamplesPerSec=11.953735180205697, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:06:45,343] [INFO] [timer.py:197:stop] 0/4674, RunningAvgSamplesPerSec=12.007823851178635, CurrSamplesPerSec=16.263947173630875, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:06:51,845] [INFO] [timer.py:197:stop] 0/4675, RunningAvgSamplesPerSec=12.007794787917016, CurrSamplesPerSec=11.873529816360572, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 7.422222222222223e-07, 'epoch': 123.03} [2022-12-20 03:06:58,474] [INFO] [timer.py:197:stop] 0/4676, RunningAvgSamplesPerSec=12.00779520866758, CurrSamplesPerSec=12.009761698117066, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 03:07:04,963] [INFO] [timer.py:197:stop] 0/4677, RunningAvgSamplesPerSec=12.007798678235309, CurrSamplesPerSec=12.024037373153979, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:07:11,462] [INFO] [timer.py:197:stop] 0/4678, RunningAvgSamplesPerSec=12.007772800949523, CurrSamplesPerSec=11.888003404727957, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:07:17,946] [INFO] [timer.py:197:stop] 0/4679, RunningAvgSamplesPerSec=12.00775533227017, CurrSamplesPerSec=11.92662380817271, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:07:24,376] [INFO] [logging.py:68:log_dist] [Rank 0] step=4680, skipped=8, lr=[7.311111111111112e-07], mom=[[0.9, 0.999]] [2022-12-20 03:07:24,377] [INFO] [timer.py:197:stop] 0/4680, RunningAvgSamplesPerSec=12.007753982046863, CurrSamplesPerSec=12.001442307713802, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:07:30,790] [INFO] [timer.py:197:stop] 0/4681, RunningAvgSamplesPerSec=12.007756256543406, CurrSamplesPerSec=12.018405789957388, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:07:37,434] [INFO] [timer.py:197:stop] 0/4682, RunningAvgSamplesPerSec=12.00776228370734, CurrSamplesPerSec=12.03602978622548, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:07:43,948] [INFO] [timer.py:197:stop] 0/4683, RunningAvgSamplesPerSec=12.007744755925343, CurrSamplesPerSec=11.92627143407489, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:07:50,416] [INFO] [timer.py:197:stop] 0/4684, RunningAvgSamplesPerSec=12.00772013892374, CurrSamplesPerSec=11.893583501578284, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:07:57,078] [INFO] [timer.py:197:stop] 0/4685, RunningAvgSamplesPerSec=12.00768518324306, CurrSamplesPerSec=11.84622384411087, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:08:03,575] [INFO] [timer.py:197:stop] 0/4686, RunningAvgSamplesPerSec=12.007662439156517, CurrSamplesPerSec=11.902088545074514, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:08:10,099] [INFO] [timer.py:197:stop] 0/4687, RunningAvgSamplesPerSec=12.007662479355222, CurrSamplesPerSec=12.0078507730442, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:08:16,721] [INFO] [timer.py:197:stop] 0/4688, RunningAvgSamplesPerSec=12.007646319548405, CurrSamplesPerSec=11.932412080762692, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:08:23,177] [INFO] [timer.py:197:stop] 0/4689, RunningAvgSamplesPerSec=12.007623905009734, CurrSamplesPerSec=11.903500373021682, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:08:29,663] [INFO] [logging.py:68:log_dist] [Rank 0] step=4690, skipped=8, lr=[7.08888888888889e-07], mom=[[0.9, 0.999]] [2022-12-20 03:08:29,664] [INFO] [timer.py:197:stop] 0/4690, RunningAvgSamplesPerSec=12.007603151812326, CurrSamplesPerSec=11.911114708230434, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:08:36,253] [INFO] [timer.py:197:stop] 0/4691, RunningAvgSamplesPerSec=12.007595867052947, CurrSamplesPerSec=11.973541789367365, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:08:42,745] [INFO] [timer.py:197:stop] 0/4692, RunningAvgSamplesPerSec=12.007577794564892, CurrSamplesPerSec=11.9234298874775, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:08:49,360] [INFO] [timer.py:197:stop] 0/4693, RunningAvgSamplesPerSec=12.00757040029824, CurrSamplesPerSec=11.97299117938145, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:08:55,875] [INFO] [timer.py:197:stop] 0/4694, RunningAvgSamplesPerSec=12.0075501971284, CurrSamplesPerSec=11.913519450534425, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:09:02,321] [INFO] [timer.py:197:stop] 0/4695, RunningAvgSamplesPerSec=12.007548079973766, CurrSamplesPerSec=11.99762260339588, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:09:08,836] [INFO] [timer.py:197:stop] 0/4696, RunningAvgSamplesPerSec=12.007523313552118, 
CurrSamplesPerSec=11.892409002303738, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:09:15,303] [INFO] [timer.py:197:stop] 0/4697, RunningAvgSamplesPerSec=12.00752503685497, CurrSamplesPerSec=12.015619674759906, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:09:21,804] [INFO] [timer.py:197:stop] 0/4698, RunningAvgSamplesPerSec=12.007495784006828, CurrSamplesPerSec=11.871707147570639, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:09:28,307] [INFO] [timer.py:197:stop] 0/4699, RunningAvgSamplesPerSec=12.007491889733412, CurrSamplesPerSec=11.989232197359216, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:09:34,746] [INFO] [logging.py:68:log_dist] [Rank 0] step=4700, skipped=8, lr=[6.866666666666667e-07], mom=[[0.9, 0.999]] [2022-12-20 03:09:34,747] [INFO] [timer.py:197:stop] 0/4700, RunningAvgSamplesPerSec=12.007471972972619, CurrSamplesPerSec=11.914646297147183, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 6.866666666666667e-07, 'epoch': 123.68} [2022-12-20 03:09:41,476] [INFO] [timer.py:197:stop] 0/4701, RunningAvgSamplesPerSec=12.00745478801676, CurrSamplesPerSec=11.927259193940188, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:09:47,978] [INFO] [timer.py:197:stop] 0/4702, RunningAvgSamplesPerSec=12.007436482726803, CurrSamplesPerSec=11.922031860575064, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:09:54,430] [INFO] [timer.py:197:stop] 0/4703, RunningAvgSamplesPerSec=12.007423154505744, CurrSamplesPerSec=11.945105694437386, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:10:01,167] [INFO] [timer.py:197:stop] 0/4704, RunningAvgSamplesPerSec=12.00742297552288, CurrSamplesPerSec=12.00658163605102, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:10:07,764] [INFO] [timer.py:197:stop] 0/4705, RunningAvgSamplesPerSec=12.007425020300213, CurrSamplesPerSec=12.017047269643397, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
03:10:14,230] [INFO] [timer.py:197:stop] 0/4706, RunningAvgSamplesPerSec=12.00741884582983, CurrSamplesPerSec=11.978450383346798, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:10:20,710] [INFO] [timer.py:197:stop] 0/4707, RunningAvgSamplesPerSec=12.007386245070105, CurrSamplesPerSec=11.855966559636975, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:10:27,236] [INFO] [timer.py:197:stop] 0/4708, RunningAvgSamplesPerSec=12.007346685297767, CurrSamplesPerSec=11.824059734800755, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:10:33,718] [INFO] [timer.py:197:stop] 0/4709, RunningAvgSamplesPerSec=12.007318915343086, CurrSamplesPerSec=11.878040845859678, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:10:40,182] [INFO] [logging.py:68:log_dist] [Rank 0] step=4710, skipped=8, lr=[6.644444444444446e-07], mom=[[0.9, 0.999]] [2022-12-20 03:10:40,183] [INFO] [timer.py:197:stop] 0/4710, RunningAvgSamplesPerSec=12.007312355105018, CurrSamplesPerSec=11.976512538809773, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:10:46,654] [INFO] [timer.py:197:stop] 0/4711, RunningAvgSamplesPerSec=12.00729836763905, CurrSamplesPerSec=11.941804648805617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:10:51,299] [INFO] [timer.py:197:stop] 0/4712, RunningAvgSamplesPerSec=12.007980411455224, CurrSamplesPerSec=16.392759680847522, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:10:57,814] [INFO] [timer.py:197:stop] 0/4713, RunningAvgSamplesPerSec=12.007942948382679, CurrSamplesPerSec=11.834047727675808, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:11:04,347] [INFO] [timer.py:197:stop] 0/4714, RunningAvgSamplesPerSec=12.007920930244438, CurrSamplesPerSec=11.905082018019181, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:11:10,768] [INFO] [timer.py:197:stop] 0/4715, RunningAvgSamplesPerSec=12.0079128404449, CurrSamplesPerSec=11.96991435685414, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 03:11:17,324] [INFO] [timer.py:197:stop] 0/4716, RunningAvgSamplesPerSec=12.00791268427385, CurrSamplesPerSec=12.00717669523392, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:11:23,858] [INFO] [timer.py:197:stop] 0/4717, RunningAvgSamplesPerSec=12.00787323869034, CurrSamplesPerSec=11.82476289283951, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:11:30,415] [INFO] [timer.py:197:stop] 0/4718, RunningAvgSamplesPerSec=12.007846822030581, CurrSamplesPerSec=11.884571250769142, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:11:36,999] [INFO] [timer.py:197:stop] 0/4719, RunningAvgSamplesPerSec=12.007823852593255, CurrSamplesPerSec=11.900468654276002, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:11:43,631] [INFO] [logging.py:68:log_dist] [Rank 0] step=4720, skipped=8, lr=[6.422222222222223e-07], mom=[[0.9, 0.999]] [2022-12-20 03:11:43,632] [INFO] [timer.py:197:stop] 0/4720, RunningAvgSamplesPerSec=12.007797212358168, CurrSamplesPerSec=11.883436933713257, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:11:50,121] [INFO] [timer.py:197:stop] 0/4721, RunningAvgSamplesPerSec=12.007774280090256, CurrSamplesPerSec=11.900546209028448, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:11:56,558] [INFO] [timer.py:197:stop] 0/4722, RunningAvgSamplesPerSec=12.007771408616044, CurrSamplesPerSec=11.994236199212247, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:12:03,071] [INFO] [timer.py:197:stop] 0/4723, RunningAvgSamplesPerSec=12.007757178403729, CurrSamplesPerSec=11.94096426831486, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:12:09,556] [INFO] [timer.py:197:stop] 0/4724, RunningAvgSamplesPerSec=12.00773821514169, CurrSamplesPerSec=11.918875326471442, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:12:16,055] [INFO] [timer.py:197:stop] 0/4725, RunningAvgSamplesPerSec=12.007720399854575, CurrSamplesPerSec=11.924181992110379, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 6.311111111111112e-07, 'epoch': 124.34} [2022-12-20 03:12:22,435] [INFO] [timer.py:197:stop] 0/4726, RunningAvgSamplesPerSec=12.00770470054889, CurrSamplesPerSec=11.934012029886928, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:12:28,972] [INFO] [timer.py:197:stop] 0/4727, RunningAvgSamplesPerSec=12.007661149333394, CurrSamplesPerSec=11.805391571905437, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:12:35,527] [INFO] [timer.py:197:stop] 0/4728, RunningAvgSamplesPerSec=12.007633973114041, CurrSamplesPerSec=11.88058526194735, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:12:42,025] [INFO] [timer.py:197:stop] 0/4729, RunningAvgSamplesPerSec=12.007611766472156, CurrSamplesPerSec=11.903572688740384, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:12:48,532] [INFO] [logging.py:68:log_dist] [Rank 0] step=4730, skipped=8, lr=[6.200000000000001e-07], mom=[[0.9, 0.999]] [2022-12-20 03:12:48,533] [INFO] [timer.py:197:stop] 0/4730, RunningAvgSamplesPerSec=12.007591902736403, CurrSamplesPerSec=11.914424718260408, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:12:55,014] [INFO] [timer.py:197:stop] 0/4731, RunningAvgSamplesPerSec=12.007577148680163, CurrSamplesPerSec=11.938222964274829, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:13:01,457] [INFO] [timer.py:197:stop] 0/4732, RunningAvgSamplesPerSec=12.007564122856538, CurrSamplesPerSec=11.946279461334903, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:13:07,992] [INFO] [timer.py:197:stop] 0/4733, RunningAvgSamplesPerSec=12.00753112935926, CurrSamplesPerSec=11.853474550034816, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:13:14,475] [INFO] [timer.py:197:stop] 0/4734, RunningAvgSamplesPerSec=12.00750879905668, CurrSamplesPerSec=11.90278570871771, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:13:20,939] [INFO] [timer.py:197:stop] 0/4735, 
RunningAvgSamplesPerSec=12.007498983686508, CurrSamplesPerSec=11.961231656984355, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:13:27,496] [INFO] [timer.py:197:stop] 0/4736, RunningAvgSamplesPerSec=12.0074777534793, CurrSamplesPerSec=11.907829250797484, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:13:33,948] [INFO] [timer.py:197:stop] 0/4737, RunningAvgSamplesPerSec=12.007458069568477, CurrSamplesPerSec=11.91499216706057, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:13:40,398] [INFO] [timer.py:197:stop] 0/4738, RunningAvgSamplesPerSec=12.00745839778998, CurrSamplesPerSec=12.009012727827386, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:13:46,898] [INFO] [timer.py:197:stop] 0/4739, RunningAvgSamplesPerSec=12.007429556748079, CurrSamplesPerSec=11.872375026918293, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:13:53,384] [INFO] [logging.py:68:log_dist] [Rank 0] step=4740, skipped=8, lr=[5.977777777777778e-07], mom=[[0.9, 0.999]] [2022-12-20 03:13:53,384] [INFO] [timer.py:197:stop] 0/4740, RunningAvgSamplesPerSec=12.007420655528287, CurrSamplesPerSec=11.96540315653249, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:13:59,995] [INFO] [timer.py:197:stop] 0/4741, RunningAvgSamplesPerSec=12.007392289623912, CurrSamplesPerSec=11.874482594856357, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:14:06,525] [INFO] [timer.py:197:stop] 0/4742, RunningAvgSamplesPerSec=12.007396197852424, CurrSamplesPerSec=12.025945911268217, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:14:13,087] [INFO] [timer.py:197:stop] 0/4743, RunningAvgSamplesPerSec=12.007368642321252, CurrSamplesPerSec=11.878161207925546, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:14:19,553] [INFO] [timer.py:197:stop] 0/4744, RunningAvgSamplesPerSec=12.007366063016685, CurrSamplesPerSec=11.995150023687739, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:14:25,962] [INFO] 
[timer.py:197:stop] 0/4745, RunningAvgSamplesPerSec=12.007376610650931, CurrSamplesPerSec=12.057602754016742, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:14:32,425] [INFO] [timer.py:197:stop] 0/4746, RunningAvgSamplesPerSec=12.007357411088886, CurrSamplesPerSec=11.916979457157392, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:14:38,961] [INFO] [timer.py:197:stop] 0/4747, RunningAvgSamplesPerSec=12.0073348378314, CurrSamplesPerSec=11.901194120877284, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:14:45,408] [INFO] [timer.py:197:stop] 0/4748, RunningAvgSamplesPerSec=12.007335449457143, CurrSamplesPerSec=12.010238315366887, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:14:51,895] [INFO] [timer.py:197:stop] 0/4749, RunningAvgSamplesPerSec=12.007318431775184, CurrSamplesPerSec=11.92709225951311, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:14:56,512] [INFO] [logging.py:68:log_dist] [Rank 0] step=4750, skipped=8, lr=[5.755555555555555e-07], mom=[[0.9, 0.999]] [2022-12-20 03:14:56,512] [INFO] [timer.py:197:stop] 0/4750, RunningAvgSamplesPerSec=12.008026070968972, CurrSamplesPerSec=16.67224856504998, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 5.755555555555555e-07, 'epoch': 125.0} [2022-12-20 03:15:03,022] [INFO] [timer.py:197:stop] 0/4751, RunningAvgSamplesPerSec=12.0079948879271, CurrSamplesPerSec=11.861741471345702, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:15:09,517] [INFO] [timer.py:197:stop] 0/4752, RunningAvgSamplesPerSec=12.007973982351334, CurrSamplesPerSec=11.909507682769604, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:15:15,977] [INFO] [timer.py:197:stop] 0/4753, RunningAvgSamplesPerSec=12.00797437621492, CurrSamplesPerSec=12.009845519829891, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:15:22,454] [INFO] [timer.py:197:stop] 0/4754, RunningAvgSamplesPerSec=12.007976966348263, 
CurrSamplesPerSec=12.020295316304264, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:15:28,915] [INFO] [timer.py:197:stop] 0/4755, RunningAvgSamplesPerSec=12.007958128996734, CurrSamplesPerSec=11.919105538663752, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:15:35,344] [INFO] [timer.py:197:stop] 0/4756, RunningAvgSamplesPerSec=12.007941822684163, CurrSamplesPerSec=11.930935055084172, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:15:41,890] [INFO] [timer.py:197:stop] 0/4757, RunningAvgSamplesPerSec=12.007919569886115, CurrSamplesPerSec=11.9030538297274, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:15:48,327] [INFO] [timer.py:197:stop] 0/4758, RunningAvgSamplesPerSec=12.007904071829321, CurrSamplesPerSec=11.93466040728096, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:15:55,159] [INFO] [timer.py:197:stop] 0/4759, RunningAvgSamplesPerSec=12.007876095420489, CurrSamplesPerSec=11.876278793240989, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:16:02,125] [INFO] [logging.py:68:log_dist] [Rank 0] step=4760, skipped=8, lr=[5.533333333333334e-07], mom=[[0.9, 0.999]] [2022-12-20 03:16:02,125] [INFO] [timer.py:197:stop] 0/4760, RunningAvgSamplesPerSec=12.007848702364806, CurrSamplesPerSec=11.878939152595663, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:16:09,104] [INFO] [timer.py:197:stop] 0/4761, RunningAvgSamplesPerSec=12.007838504752206, CurrSamplesPerSec=11.959513572277269, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:16:15,898] [INFO] [timer.py:197:stop] 0/4762, RunningAvgSamplesPerSec=12.007810322517404, CurrSamplesPerSec=11.875172850747113, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:16:22,602] [INFO] [timer.py:197:stop] 0/4763, RunningAvgSamplesPerSec=12.007790860639044, CurrSamplesPerSec=11.915861690211294, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:16:29,039] [INFO] [timer.py:197:stop] 0/4764, 
RunningAvgSamplesPerSec=12.007792851000586, CurrSamplesPerSec=12.017276447968888, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:16:35,514] [INFO] [timer.py:197:stop] 0/4765, RunningAvgSamplesPerSec=12.007797488781604, CurrSamplesPerSec=12.029923305027758, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:16:42,015] [INFO] [timer.py:197:stop] 0/4766, RunningAvgSamplesPerSec=12.007784811024974, CurrSamplesPerSec=11.947702856792892, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:16:48,426] [INFO] [timer.py:197:stop] 0/4767, RunningAvgSamplesPerSec=12.007782102655176, CurrSamplesPerSec=11.994893281222316, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:16:55,100] [INFO] [timer.py:197:stop] 0/4768, RunningAvgSamplesPerSec=12.007754442133137, CurrSamplesPerSec=11.877383365143023, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:17:01,716] [INFO] [timer.py:197:stop] 0/4769, RunningAvgSamplesPerSec=12.007735230290173, CurrSamplesPerSec=11.916864655527794, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:17:08,213] [INFO] [logging.py:68:log_dist] [Rank 0] step=4770, skipped=8, lr=[5.311111111111111e-07], mom=[[0.9, 0.999]] [2022-12-20 03:17:08,213] [INFO] [timer.py:197:stop] 0/4770, RunningAvgSamplesPerSec=12.007730112326943, CurrSamplesPerSec=11.983382261990528, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:17:14,681] [INFO] [timer.py:197:stop] 0/4771, RunningAvgSamplesPerSec=12.007715145110183, CurrSamplesPerSec=11.936773163096994, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:17:21,147] [INFO] [timer.py:197:stop] 0/4772, RunningAvgSamplesPerSec=12.007690549534663, CurrSamplesPerSec=11.891529201941909, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:17:27,669] [INFO] [timer.py:197:stop] 0/4773, RunningAvgSamplesPerSec=12.007694298780661, CurrSamplesPerSec=12.025604883291777, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:17:34,231] [INFO] 
[timer.py:197:stop] 0/4774, RunningAvgSamplesPerSec=12.00767820238549, CurrSamplesPerSec=11.931370434697888, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:17:40,925] [INFO] [timer.py:197:stop] 0/4775, RunningAvgSamplesPerSec=12.007628551714367, CurrSamplesPerSec=11.775281159117167, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 5.2e-07, 'epoch': 125.66} [2022-12-20 03:17:47,538] [INFO] [timer.py:197:stop] 0/4776, RunningAvgSamplesPerSec=12.007623436629926, CurrSamplesPerSec=11.983258688177019, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:17:53,957] [INFO] [timer.py:197:stop] 0/4777, RunningAvgSamplesPerSec=12.00762596681897, CurrSamplesPerSec=12.019717255141579, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:18:00,411] [INFO] [timer.py:197:stop] 0/4778, RunningAvgSamplesPerSec=12.0076288642664, CurrSamplesPerSec=12.021480138673013, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:18:06,857] [INFO] [timer.py:197:stop] 0/4779, RunningAvgSamplesPerSec=12.00760722295331, CurrSamplesPerSec=11.905130593208764, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:18:13,485] [INFO] [logging.py:68:log_dist] [Rank 0] step=4780, skipped=8, lr=[5.088888888888889e-07], mom=[[0.9, 0.999]] [2022-12-20 03:18:13,485] [INFO] [timer.py:197:stop] 0/4780, RunningAvgSamplesPerSec=12.007574948543922, CurrSamplesPerSec=11.855354977922318, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:18:20,178] [INFO] [timer.py:197:stop] 0/4781, RunningAvgSamplesPerSec=12.007557643136009, CurrSamplesPerSec=11.92543800648651, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:18:26,784] [INFO] [timer.py:197:stop] 0/4782, RunningAvgSamplesPerSec=12.007525906724045, CurrSamplesPerSec=11.857749827104685, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:18:33,248] [INFO] [timer.py:197:stop] 0/4783, RunningAvgSamplesPerSec=12.00750039127333, CurrSamplesPerSec=11.886763158434274, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:18:39,831] [INFO] [timer.py:197:stop] 0/4784, RunningAvgSamplesPerSec=12.00747217319631, CurrSamplesPerSec=11.874060811313228, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:18:46,328] [INFO] [timer.py:197:stop] 0/4785, RunningAvgSamplesPerSec=12.007442290597565, CurrSamplesPerSec=11.866224662502662, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:18:52,937] [INFO] [timer.py:197:stop] 0/4786, RunningAvgSamplesPerSec=12.00744052888745, CurrSamplesPerSec=11.999020179650419, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:18:59,604] [INFO] [timer.py:197:stop] 0/4787, RunningAvgSamplesPerSec=12.007437229329895, CurrSamplesPerSec=11.991672874277619, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:19:04,235] [INFO] [timer.py:197:stop] 0/4788, RunningAvgSamplesPerSec=12.00813406670647, CurrSamplesPerSec=16.624668046009813, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:19:10,867] [INFO] [timer.py:197:stop] 0/4789, RunningAvgSamplesPerSec=12.008115036064677, CurrSamplesPerSec=11.917720166003813, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:19:17,461] [INFO] [logging.py:68:log_dist] [Rank 0] step=4790, skipped=8, lr=[4.866666666666666e-07], mom=[[0.9, 0.999]] [2022-12-20 03:19:17,462] [INFO] [timer.py:197:stop] 0/4790, RunningAvgSamplesPerSec=12.008112765110903, CurrSamplesPerSec=11.997251544218647, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:19:23,983] [INFO] [timer.py:197:stop] 0/4791, RunningAvgSamplesPerSec=12.008110085360798, CurrSamplesPerSec=11.995293139603062, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:19:30,430] [INFO] [timer.py:197:stop] 0/4792, RunningAvgSamplesPerSec=12.008111549309566, CurrSamplesPerSec=12.01512649643875, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:19:36,963] [INFO] [timer.py:197:stop] 0/4793, RunningAvgSamplesPerSec=12.008103959404645, 
CurrSamplesPerSec=11.971858075510415, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:19:43,431] [INFO] [timer.py:197:stop] 0/4794, RunningAvgSamplesPerSec=12.008086690470938, CurrSamplesPerSec=11.92591749128188, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:19:49,927] [INFO] [timer.py:197:stop] 0/4795, RunningAvgSamplesPerSec=12.008061010696723, CurrSamplesPerSec=11.886252079597968, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:19:56,405] [INFO] [timer.py:197:stop] 0/4796, RunningAvgSamplesPerSec=12.00806195881122, CurrSamplesPerSec=12.012607992350178, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:20:02,963] [INFO] [timer.py:197:stop] 0/4797, RunningAvgSamplesPerSec=12.008035178233037, CurrSamplesPerSec=11.881007509141613, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:20:09,464] [INFO] [timer.py:197:stop] 0/4798, RunningAvgSamplesPerSec=12.008030951702304, CurrSamplesPerSec=11.987798890060368, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:20:16,031] [INFO] [timer.py:197:stop] 0/4799, RunningAvgSamplesPerSec=12.007995799742163, CurrSamplesPerSec=11.841741646002601, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:20:22,537] [INFO] [logging.py:68:log_dist] [Rank 0] step=4800, skipped=8, lr=[4.6444444444444446e-07], mom=[[0.9, 0.999]] [2022-12-20 03:20:22,538] [INFO] [timer.py:197:stop] 0/4800, RunningAvgSamplesPerSec=12.007973833558056, CurrSamplesPerSec=11.903518847756764, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 4.6444444444444446e-07, 'epoch': 126.32} [2022-12-20 03:20:29,069] [INFO] [timer.py:197:stop] 0/4801, RunningAvgSamplesPerSec=12.00794699119364, CurrSamplesPerSec=11.880524267324288, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:20:35,590] [INFO] [timer.py:197:stop] 0/4802, RunningAvgSamplesPerSec=12.007926846313742, CurrSamplesPerSec=11.912023839361217, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
03:20:42,180] [INFO] [timer.py:197:stop] 0/4803, RunningAvgSamplesPerSec=12.007921593222838, CurrSamplesPerSec=11.982759604292617, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:20:48,777] [INFO] [timer.py:197:stop] 0/4804, RunningAvgSamplesPerSec=12.007911099690824, CurrSamplesPerSec=11.957742181119473, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:20:55,276] [INFO] [timer.py:197:stop] 0/4805, RunningAvgSamplesPerSec=12.007889083842851, CurrSamplesPerSec=11.903091832107693, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:21:01,823] [INFO] [timer.py:197:stop] 0/4806, RunningAvgSamplesPerSec=12.007867585986336, CurrSamplesPerSec=11.905493863714826, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:21:08,319] [INFO] [timer.py:197:stop] 0/4807, RunningAvgSamplesPerSec=12.007851597037602, CurrSamplesPerSec=11.931529003498223, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:21:14,799] [INFO] [timer.py:197:stop] 0/4808, RunningAvgSamplesPerSec=12.007853566729846, CurrSamplesPerSec=12.017325405035356, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:21:21,536] [INFO] [timer.py:197:stop] 0/4809, RunningAvgSamplesPerSec=12.007831827771327, CurrSamplesPerSec=11.904255772524287, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:21:28,053] [INFO] [logging.py:68:log_dist] [Rank 0] step=4810, skipped=8, lr=[4.422222222222223e-07], mom=[[0.9, 0.999]] [2022-12-20 03:21:28,054] [INFO] [timer.py:197:stop] 0/4810, RunningAvgSamplesPerSec=12.00780872815964, CurrSamplesPerSec=11.897786515064093, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:21:34,478] [INFO] [timer.py:197:stop] 0/4811, RunningAvgSamplesPerSec=12.007797844437734, CurrSamplesPerSec=11.955696011974783, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:21:41,115] [INFO] [timer.py:197:stop] 0/4812, RunningAvgSamplesPerSec=12.007725695856406, CurrSamplesPerSec=11.670509049579982, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 03:21:47,565] [INFO] [timer.py:197:stop] 0/4813, RunningAvgSamplesPerSec=12.007704495267115, CurrSamplesPerSec=11.906588561364435, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:21:54,094] [INFO] [timer.py:197:stop] 0/4814, RunningAvgSamplesPerSec=12.007684027884652, CurrSamplesPerSec=11.910016536053877, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:22:00,594] [INFO] [timer.py:197:stop] 0/4815, RunningAvgSamplesPerSec=12.007647246815802, CurrSamplesPerSec=11.83322818256806, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:22:07,193] [INFO] [timer.py:197:stop] 0/4816, RunningAvgSamplesPerSec=12.007648405185572, CurrSamplesPerSec=12.013226229249657, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:22:13,843] [INFO] [timer.py:197:stop] 0/4817, RunningAvgSamplesPerSec=12.007632898472266, CurrSamplesPerSec=11.933444890081661, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:22:20,390] [INFO] [timer.py:197:stop] 0/4818, RunningAvgSamplesPerSec=12.007622057247733, CurrSamplesPerSec=11.955647555760452, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:22:26,910] [INFO] [timer.py:197:stop] 0/4819, RunningAvgSamplesPerSec=12.007609124238929, CurrSamplesPerSec=11.94564523707839, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:22:33,369] [INFO] [logging.py:68:log_dist] [Rank 0] step=4820, skipped=8, lr=[4.2000000000000006e-07], mom=[[0.9, 0.999]] [2022-12-20 03:22:33,370] [INFO] [timer.py:197:stop] 0/4820, RunningAvgSamplesPerSec=12.007607879302126, CurrSamplesPerSec=12.001614012814546, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:22:39,846] [INFO] [timer.py:197:stop] 0/4821, RunningAvgSamplesPerSec=12.007588699444542, CurrSamplesPerSec=11.91588602175969, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:22:46,301] [INFO] [timer.py:197:stop] 0/4822, RunningAvgSamplesPerSec=12.007565349670852, CurrSamplesPerSec=11.896087661344861, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:22:52,723] [INFO] [timer.py:197:stop] 0/4823, RunningAvgSamplesPerSec=12.007569812666915, CurrSamplesPerSec=12.029120069132313, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:22:59,298] [INFO] [timer.py:197:stop] 0/4824, RunningAvgSamplesPerSec=12.007566845935644, CurrSamplesPerSec=11.993281254042003, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:23:05,773] [INFO] [timer.py:197:stop] 0/4825, RunningAvgSamplesPerSec=12.007535985876094, CurrSamplesPerSec=11.860550718144555, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 4.0888888888888897e-07, 'epoch': 126.97} [2022-12-20 03:23:10,386] [INFO] [timer.py:197:stop] 0/4826, RunningAvgSamplesPerSec=12.008232026516497, CurrSamplesPerSec=16.668255613057443, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:23:16,855] [INFO] [timer.py:197:stop] 0/4827, RunningAvgSamplesPerSec=12.008228931953479, CurrSamplesPerSec=11.993319298892748, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:23:23,323] [INFO] [timer.py:197:stop] 0/4828, RunningAvgSamplesPerSec=12.008233524792338, CurrSamplesPerSec=12.03043495213335, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:23:29,820] [INFO] [timer.py:197:stop] 0/4829, RunningAvgSamplesPerSec=12.008214911708682, CurrSamplesPerSec=11.919055261678842, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:23:36,278] [INFO] [logging.py:68:log_dist] [Rank 0] step=4830, skipped=8, lr=[3.9777777777777783e-07], mom=[[0.9, 0.999]] [2022-12-20 03:23:36,279] [INFO] [timer.py:197:stop] 0/4830, RunningAvgSamplesPerSec=12.008210175823477, CurrSamplesPerSec=11.985393503194173, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:23:42,735] [INFO] [timer.py:197:stop] 0/4831, RunningAvgSamplesPerSec=12.008175531893828, CurrSamplesPerSec=11.843212866912973, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:23:49,281] [INFO] [timer.py:197:stop] 
0/4832, RunningAvgSamplesPerSec=12.008175791295383, CurrSamplesPerSec=12.009428572106042, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:23:55,881] [INFO] [timer.py:197:stop] 0/4833, RunningAvgSamplesPerSec=12.008177397024902, CurrSamplesPerSec=12.015938084015408, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:24:02,388] [INFO] [timer.py:197:stop] 0/4834, RunningAvgSamplesPerSec=12.008157806460495, CurrSamplesPerSec=11.914256027688554, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:24:08,826] [INFO] [timer.py:197:stop] 0/4835, RunningAvgSamplesPerSec=12.008139282067896, CurrSamplesPerSec=11.919291831821115, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:24:15,259] [INFO] [timer.py:197:stop] 0/4836, RunningAvgSamplesPerSec=12.008140390399683, CurrSamplesPerSec=12.013499348944494, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:24:21,805] [INFO] [timer.py:197:stop] 0/4837, RunningAvgSamplesPerSec=12.008121455034622, CurrSamplesPerSec=11.91728049128937, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:24:28,223] [INFO] [timer.py:197:stop] 0/4838, RunningAvgSamplesPerSec=12.008124862552629, CurrSamplesPerSec=12.024622852321613, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:24:34,720] [INFO] [timer.py:197:stop] 0/4839, RunningAvgSamplesPerSec=12.008106035448092, CurrSamplesPerSec=11.917743446917129, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:24:41,141] [INFO] [logging.py:68:log_dist] [Rank 0] step=4840, skipped=8, lr=[3.755555555555556e-07], mom=[[0.9, 0.999]] [2022-12-20 03:24:41,142] [INFO] [timer.py:197:stop] 0/4840, RunningAvgSamplesPerSec=12.008107536449549, CurrSamplesPerSec=12.015372273812964, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:24:47,570] [INFO] [timer.py:197:stop] 0/4841, RunningAvgSamplesPerSec=12.00811122568446, CurrSamplesPerSec=12.025986318708188, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:24:54,088] [INFO] 
[timer.py:197:stop] 0/4842, RunningAvgSamplesPerSec=12.008096338193239, CurrSamplesPerSec=11.93648547438439, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:25:00,591] [INFO] [timer.py:197:stop] 0/4843, RunningAvgSamplesPerSec=12.008073885880679, CurrSamplesPerSec=11.900379493938923, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:25:06,978] [INFO] [timer.py:197:stop] 0/4844, RunningAvgSamplesPerSec=12.008077775898142, CurrSamplesPerSec=12.02693893541419, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:25:13,451] [INFO] [timer.py:197:stop] 0/4845, RunningAvgSamplesPerSec=12.008061500541713, CurrSamplesPerSec=11.929770133383345, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:25:19,920] [INFO] [timer.py:197:stop] 0/4846, RunningAvgSamplesPerSec=12.008046774014947, CurrSamplesPerSec=11.937147391490809, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:25:26,395] [INFO] [timer.py:197:stop] 0/4847, RunningAvgSamplesPerSec=12.00803323967971, CurrSamplesPerSec=11.942828989265598, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:25:32,935] [INFO] [timer.py:197:stop] 0/4848, RunningAvgSamplesPerSec=12.008017658428214, CurrSamplesPerSec=11.93299821922192, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:25:39,451] [INFO] [timer.py:197:stop] 0/4849, RunningAvgSamplesPerSec=12.007998877879832, CurrSamplesPerSec=11.917673075343576, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:25:45,960] [INFO] [logging.py:68:log_dist] [Rank 0] step=4850, skipped=8, lr=[3.533333333333334e-07], mom=[[0.9, 0.999]] [2022-12-20 03:25:45,960] [INFO] [timer.py:197:stop] 0/4850, RunningAvgSamplesPerSec=12.007979713905222, CurrSamplesPerSec=11.91580509329833, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 3.533333333333334e-07, 'epoch': 127.63} [2022-12-20 03:25:52,438] [INFO] [timer.py:197:stop] 0/4851, RunningAvgSamplesPerSec=12.007976437906116, 
CurrSamplesPerSec=11.992115376788824, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:25:58,893] [INFO] [timer.py:197:stop] 0/4852, RunningAvgSamplesPerSec=12.007979040180174, CurrSamplesPerSec=12.020610743698992, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:26:05,301] [INFO] [timer.py:197:stop] 0/4853, RunningAvgSamplesPerSec=12.007959814034033, CurrSamplesPerSec=11.915431672937792, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:26:11,812] [INFO] [timer.py:197:stop] 0/4854, RunningAvgSamplesPerSec=12.007941026415292, CurrSamplesPerSec=11.917488949143284, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:26:18,338] [INFO] [timer.py:197:stop] 0/4855, RunningAvgSamplesPerSec=12.007916923465928, CurrSamplesPerSec=11.892097632921272, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:26:24,929] [INFO] [timer.py:197:stop] 0/4856, RunningAvgSamplesPerSec=12.007867246366436, CurrSamplesPerSec=11.771530218710174, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:26:31,458] [INFO] [timer.py:197:stop] 0/4857, RunningAvgSamplesPerSec=12.007835731372984, CurrSamplesPerSec=11.856786638936757, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:26:37,998] [INFO] [timer.py:197:stop] 0/4858, RunningAvgSamplesPerSec=12.00781633741532, CurrSamplesPerSec=11.914391402815951, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:26:44,472] [INFO] [timer.py:197:stop] 0/4859, RunningAvgSamplesPerSec=12.007792551877213, CurrSamplesPerSec=11.893390633827345, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:26:50,952] [INFO] [logging.py:68:log_dist] [Rank 0] step=4860, skipped=8, lr=[3.3111111111111115e-07], mom=[[0.9, 0.999]] [2022-12-20 03:26:50,953] [INFO] [timer.py:197:stop] 0/4860, RunningAvgSamplesPerSec=12.007787183520776, CurrSamplesPerSec=11.981769583421135, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:26:57,358] [INFO] [timer.py:197:stop] 0/4861, 
RunningAvgSamplesPerSec=12.007761012749924, CurrSamplesPerSec=11.88195570280231, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:27:03,816] [INFO] [timer.py:197:stop] 0/4862, RunningAvgSamplesPerSec=12.007744963361064, CurrSamplesPerSec=11.930264282671162, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:27:10,307] [INFO] [timer.py:197:stop] 0/4863, RunningAvgSamplesPerSec=12.007713363227992, CurrSamplesPerSec=11.85607652537053, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:27:14,968] [INFO] [timer.py:197:stop] 0/4864, RunningAvgSamplesPerSec=12.008394703212264, CurrSamplesPerSec=16.582107810358114, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:27:21,478] [INFO] [timer.py:197:stop] 0/4865, RunningAvgSamplesPerSec=12.008381177794822, CurrSamplesPerSec=11.94297882972843, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:27:27,934] [INFO] [timer.py:197:stop] 0/4866, RunningAvgSamplesPerSec=12.008364423895229, CurrSamplesPerSec=11.927439382375324, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:27:34,425] [INFO] [timer.py:197:stop] 0/4867, RunningAvgSamplesPerSec=12.00836591740237, CurrSamplesPerSec=12.015634734276928, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:27:40,890] [INFO] [timer.py:197:stop] 0/4868, RunningAvgSamplesPerSec=12.008349845770836, CurrSamplesPerSec=11.930667267328717, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:27:47,408] [INFO] [timer.py:197:stop] 0/4869, RunningAvgSamplesPerSec=12.008328641805024, CurrSamplesPerSec=11.906029305720908, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:27:53,861] [INFO] [logging.py:68:log_dist] [Rank 0] step=4870, skipped=8, lr=[3.088888888888889e-07], mom=[[0.9, 0.999]] [2022-12-20 03:27:53,861] [INFO] [timer.py:197:stop] 0/4870, RunningAvgSamplesPerSec=12.008328611170365, CurrSamplesPerSec=12.008179514139151, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:28:00,326] [INFO] 
[timer.py:197:stop] 0/4871, RunningAvgSamplesPerSec=12.008306063321829, CurrSamplesPerSec=11.899537548588162, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:28:06,755] [INFO] [timer.py:197:stop] 0/4872, RunningAvgSamplesPerSec=12.00829043936644, CurrSamplesPerSec=11.932696390507166, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:28:13,251] [INFO] [timer.py:197:stop] 0/4873, RunningAvgSamplesPerSec=12.008282719361015, CurrSamplesPerSec=11.970803659264242, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:28:19,675] [INFO] [timer.py:197:stop] 0/4874, RunningAvgSamplesPerSec=12.008281095340413, CurrSamplesPerSec=12.000375699838182, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:28:26,277] [INFO] [timer.py:197:stop] 0/4875, RunningAvgSamplesPerSec=12.008222075109458, CurrSamplesPerSec=11.72740136849959, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.977777777777778e-07, 'epoch': 128.29} [2022-12-20 03:28:32,766] [INFO] [timer.py:197:stop] 0/4876, RunningAvgSamplesPerSec=12.008204105355416, CurrSamplesPerSec=11.921271556321168, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:28:39,293] [INFO] [timer.py:197:stop] 0/4877, RunningAvgSamplesPerSec=12.00817659136585, CurrSamplesPerSec=11.875554784798378, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:28:45,727] [INFO] [timer.py:197:stop] 0/4878, RunningAvgSamplesPerSec=12.008175816171033, CurrSamplesPerSec=12.00439793061498, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:28:52,244] [INFO] [timer.py:197:stop] 0/4879, RunningAvgSamplesPerSec=12.008155371896537, CurrSamplesPerSec=11.909289993502663, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:28:58,801] [INFO] [logging.py:68:log_dist] [Rank 0] step=4880, skipped=8, lr=[2.866666666666667e-07], mom=[[0.9, 0.999]] [2022-12-20 03:28:58,802] [INFO] [timer.py:197:stop] 0/4880, RunningAvgSamplesPerSec=12.008133710319106, 
CurrSamplesPerSec=11.903411695091547, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:29:05,322] [INFO] [timer.py:197:stop] 0/4881, RunningAvgSamplesPerSec=12.0081148219262, CurrSamplesPerSec=11.916678967251618, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:29:11,725] [INFO] [timer.py:197:stop] 0/4882, RunningAvgSamplesPerSec=12.008115647098359, CurrSamplesPerSec=12.012143012612695, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:29:18,262] [INFO] [timer.py:197:stop] 0/4883, RunningAvgSamplesPerSec=12.008090492924715, CurrSamplesPerSec=11.886580512444102, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:29:24,795] [INFO] [timer.py:197:stop] 0/4884, RunningAvgSamplesPerSec=12.008068384650926, CurrSamplesPerSec=11.901119195867258, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:29:31,320] [INFO] [timer.py:197:stop] 0/4885, RunningAvgSamplesPerSec=12.008045494787465, CurrSamplesPerSec=11.897327745105322, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:29:37,825] [INFO] [timer.py:197:stop] 0/4886, RunningAvgSamplesPerSec=12.008024669096367, CurrSamplesPerSec=11.907186954427669, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:29:44,307] [INFO] [timer.py:197:stop] 0/4887, RunningAvgSamplesPerSec=12.008024274805106, CurrSamplesPerSec=12.006098865126255, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:29:50,759] [INFO] [timer.py:197:stop] 0/4888, RunningAvgSamplesPerSec=12.008010013820456, CurrSamplesPerSec=11.938747017875645, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:29:57,235] [INFO] [timer.py:197:stop] 0/4889, RunningAvgSamplesPerSec=12.007993126780523, CurrSamplesPerSec=11.926046243489523, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:30:03,748] [INFO] [logging.py:68:log_dist] [Rank 0] step=4890, skipped=8, lr=[2.6444444444444447e-07], mom=[[0.9, 0.999]] [2022-12-20 03:30:03,749] [INFO] [timer.py:197:stop] 0/4890, 
RunningAvgSamplesPerSec=12.007968774443079, CurrSamplesPerSec=11.890127058726812, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:30:10,262] [INFO] [timer.py:197:stop] 0/4891, RunningAvgSamplesPerSec=12.007942026023347, CurrSamplesPerSec=11.878604308514047, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:30:16,778] [INFO] [timer.py:197:stop] 0/4892, RunningAvgSamplesPerSec=12.007916276622876, CurrSamplesPerSec=11.88333382471777, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:30:23,246] [INFO] [timer.py:197:stop] 0/4893, RunningAvgSamplesPerSec=12.007913833042332, CurrSamplesPerSec=11.995976605393972, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:30:29,660] [INFO] [timer.py:197:stop] 0/4894, RunningAvgSamplesPerSec=12.007902064031938, CurrSamplesPerSec=11.95061450945089, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:30:36,152] [INFO] [timer.py:197:stop] 0/4895, RunningAvgSamplesPerSec=12.007869817149109, CurrSamplesPerSec=11.85216404778299, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:30:43,166] [INFO] [timer.py:197:stop] 0/4896, RunningAvgSamplesPerSec=12.007815815545746, CurrSamplesPerSec=11.749276250014335, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:30:50,179] [INFO] [timer.py:197:stop] 0/4897, RunningAvgSamplesPerSec=12.007804147923627, CurrSamplesPerSec=11.95097311212588, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:30:57,174] [INFO] [timer.py:197:stop] 0/4898, RunningAvgSamplesPerSec=12.007792873901577, CurrSamplesPerSec=11.95285905660138, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:31:04,004] [INFO] [timer.py:197:stop] 0/4899, RunningAvgSamplesPerSec=12.007778434767992, CurrSamplesPerSec=11.937498285100693, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:31:10,487] [INFO] [logging.py:68:log_dist] [Rank 0] step=4900, skipped=8, lr=[2.4222222222222224e-07], mom=[[0.9, 0.999]] [2022-12-20 03:31:10,488] [INFO] 
[timer.py:197:stop] 0/4900, RunningAvgSamplesPerSec=12.007749121003041, CurrSamplesPerSec=11.865895780007767, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2.4222222222222224e-07, 'epoch': 128.95} [2022-12-20 03:31:17,046] [INFO] [timer.py:197:stop] 0/4901, RunningAvgSamplesPerSec=12.007728328827085, CurrSamplesPerSec=11.90674488772877, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:31:21,781] [INFO] [timer.py:197:stop] 0/4902, RunningAvgSamplesPerSec=12.00841276460384, CurrSamplesPerSec=16.660789362634574, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:31:28,291] [INFO] [timer.py:197:stop] 0/4903, RunningAvgSamplesPerSec=12.00839103253242, CurrSamplesPerSec=11.902840070859545, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:31:34,804] [INFO] [timer.py:197:stop] 0/4904, RunningAvgSamplesPerSec=12.008375483262785, CurrSamplesPerSec=11.932649181421256, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:31:41,529] [INFO] [timer.py:197:stop] 0/4905, RunningAvgSamplesPerSec=12.008357497112145, CurrSamplesPerSec=11.920832148920022, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:31:47,956] [INFO] [timer.py:197:stop] 0/4906, RunningAvgSamplesPerSec=12.008343964938597, CurrSamplesPerSec=11.942360363576377, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:31:54,438] [INFO] [timer.py:197:stop] 0/4907, RunningAvgSamplesPerSec=12.008319706194541, CurrSamplesPerSec=11.890522069375665, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:32:00,878] [INFO] [timer.py:197:stop] 0/4908, RunningAvgSamplesPerSec=12.00831250488812, CurrSamplesPerSec=11.97309371397033, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:32:07,509] [INFO] [timer.py:197:stop] 0/4909, RunningAvgSamplesPerSec=12.008297275363606, CurrSamplesPerSec=11.934043332905205, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:32:14,127] [INFO] [logging.py:68:log_dist] [Rank 0] step=4910, 
skipped=8, lr=[2.2e-07], mom=[[0.9, 0.999]] [2022-12-20 03:32:14,127] [INFO] [timer.py:197:stop] 0/4910, RunningAvgSamplesPerSec=12.008300129813518, CurrSamplesPerSec=12.022323275820565, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:32:20,725] [INFO] [timer.py:197:stop] 0/4911, RunningAvgSamplesPerSec=12.008295576658654, CurrSamplesPerSec=11.985990210293302, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:32:27,205] [INFO] [timer.py:197:stop] 0/4912, RunningAvgSamplesPerSec=12.00827995655084, CurrSamplesPerSec=11.932087475007712, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:32:33,653] [INFO] [timer.py:197:stop] 0/4913, RunningAvgSamplesPerSec=12.008267326600874, CurrSamplesPerSec=11.946572939103401, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:32:40,141] [INFO] [timer.py:197:stop] 0/4914, RunningAvgSamplesPerSec=12.008250247885666, CurrSamplesPerSec=11.924958560245226, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:32:46,640] [INFO] [timer.py:197:stop] 0/4915, RunningAvgSamplesPerSec=12.008227771700211, CurrSamplesPerSec=11.898830744113273, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:32:53,124] [INFO] [timer.py:197:stop] 0/4916, RunningAvgSamplesPerSec=12.008212936165913, CurrSamplesPerSec=11.935765782554002, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:32:59,578] [INFO] [timer.py:197:stop] 0/4917, RunningAvgSamplesPerSec=12.008216759002758, CurrSamplesPerSec=12.027031618850447, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:33:06,049] [INFO] [timer.py:197:stop] 0/4918, RunningAvgSamplesPerSec=12.008219457853505, CurrSamplesPerSec=12.021498981442404, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:33:12,579] [INFO] [timer.py:197:stop] 0/4919, RunningAvgSamplesPerSec=12.008191156402466, CurrSamplesPerSec=11.870655074996579, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:33:19,095] [INFO] [logging.py:68:log_dist] [Rank 0] 
step=4920, skipped=8, lr=[1.9777777777777778e-07], mom=[[0.9, 0.999]] [2022-12-20 03:33:19,096] [INFO] [timer.py:197:stop] 0/4920, RunningAvgSamplesPerSec=12.008193697913676, CurrSamplesPerSec=12.020703329626395, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:33:25,657] [INFO] [timer.py:197:stop] 0/4921, RunningAvgSamplesPerSec=12.008194033671913, CurrSamplesPerSec=12.009845519829891, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:33:32,303] [INFO] [timer.py:197:stop] 0/4922, RunningAvgSamplesPerSec=12.00818348627541, CurrSamplesPerSec=11.956524088211635, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:33:38,865] [INFO] [timer.py:197:stop] 0/4923, RunningAvgSamplesPerSec=12.008165825894736, CurrSamplesPerSec=11.921901077101861, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:33:45,452] [INFO] [timer.py:197:stop] 0/4924, RunningAvgSamplesPerSec=12.008148438817217, CurrSamplesPerSec=11.923192093853508, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:33:51,891] [INFO] [timer.py:197:stop] 0/4925, RunningAvgSamplesPerSec=12.008150382130012, CurrSamplesPerSec=12.017722994227599, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.866666666666667e-07, 'epoch': 129.61} [2022-12-20 03:33:58,364] [INFO] [timer.py:197:stop] 0/4926, RunningAvgSamplesPerSec=12.008135168154919, CurrSamplesPerSec=11.9337011308578, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:34:04,934] [INFO] [timer.py:197:stop] 0/4927, RunningAvgSamplesPerSec=12.00810935076479, CurrSamplesPerSec=11.88231650841311, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:34:11,526] [INFO] [timer.py:197:stop] 0/4928, RunningAvgSamplesPerSec=12.00809240839249, CurrSamplesPerSec=11.92522715134624, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:34:18,072] [INFO] [timer.py:197:stop] 0/4929, RunningAvgSamplesPerSec=12.008094497762688, CurrSamplesPerSec=12.018395566281338, MemAllocated=1.52GB, 
MaxMemAllocated=26.06GB [2022-12-20 03:34:24,680] [INFO] [logging.py:68:log_dist] [Rank 0] step=4930, skipped=8, lr=[1.7555555555555558e-07], mom=[[0.9, 0.999]] [2022-12-20 03:34:24,681] [INFO] [timer.py:197:stop] 0/4930, RunningAvgSamplesPerSec=12.008098823678125, CurrSamplesPerSec=12.029450514939464, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:34:31,300] [INFO] [timer.py:197:stop] 0/4931, RunningAvgSamplesPerSec=12.008081812265713, CurrSamplesPerSec=11.924830891005715, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:34:37,788] [INFO] [timer.py:197:stop] 0/4932, RunningAvgSamplesPerSec=12.008083049039204, CurrSamplesPerSec=12.014182202516514, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:34:44,233] [INFO] [timer.py:197:stop] 0/4933, RunningAvgSamplesPerSec=12.008088521627192, CurrSamplesPerSec=12.035129147810647, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:34:50,661] [INFO] [timer.py:197:stop] 0/4934, RunningAvgSamplesPerSec=12.008068189695962, CurrSamplesPerSec=11.90864172818948, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:34:57,351] [INFO] [timer.py:197:stop] 0/4935, RunningAvgSamplesPerSec=12.008043222075518, CurrSamplesPerSec=11.886153132234142, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:35:03,959] [INFO] [timer.py:197:stop] 0/4936, RunningAvgSamplesPerSec=12.00802450461002, CurrSamplesPerSec=11.916395949244635, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:35:10,385] [INFO] [timer.py:197:stop] 0/4937, RunningAvgSamplesPerSec=12.008029313856795, CurrSamplesPerSec=12.03180512986515, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:35:16,851] [INFO] [timer.py:197:stop] 0/4938, RunningAvgSamplesPerSec=12.008008314881836, CurrSamplesPerSec=11.905265233034521, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:35:23,371] [INFO] [timer.py:197:stop] 0/4939, RunningAvgSamplesPerSec=12.007992513045728, CurrSamplesPerSec=11.930498116505838, 
MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:35:28,029] [INFO] [logging.py:68:log_dist] [Rank 0] step=4940, skipped=8, lr=[1.5333333333333333e-07], mom=[[0.9, 0.999]] [2022-12-20 03:35:28,030] [INFO] [timer.py:197:stop] 0/4940, RunningAvgSamplesPerSec=12.008651081118746, CurrSamplesPerSec=16.46747615337931, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:35:34,505] [INFO] [timer.py:197:stop] 0/4941, RunningAvgSamplesPerSec=12.00864952877018, CurrSamplesPerSec=12.000988922547425, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:35:41,009] [INFO] [timer.py:197:stop] 0/4942, RunningAvgSamplesPerSec=12.00863608002774, CurrSamplesPerSec=11.942578200875573, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:35:47,462] [INFO] [timer.py:197:stop] 0/4943, RunningAvgSamplesPerSec=12.00862520235678, CurrSamplesPerSec=11.955128939121114, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:35:54,136] [INFO] [timer.py:197:stop] 0/4944, RunningAvgSamplesPerSec=12.008601125897082, CurrSamplesPerSec=11.890806493294674, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:36:00,694] [INFO] [timer.py:197:stop] 0/4945, RunningAvgSamplesPerSec=12.008601521839658, CurrSamplesPerSec=12.010558589019146, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:36:07,162] [INFO] [timer.py:197:stop] 0/4946, RunningAvgSamplesPerSec=12.008585335923078, CurrSamplesPerSec=11.929107973897546, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:36:13,687] [INFO] [timer.py:197:stop] 0/4947, RunningAvgSamplesPerSec=12.008575117926801, CurrSamplesPerSec=11.958269015268167, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:36:20,199] [INFO] [timer.py:197:stop] 0/4948, RunningAvgSamplesPerSec=12.008547910479841, CurrSamplesPerSec=11.875498044761086, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:36:26,667] [INFO] [timer.py:197:stop] 0/4949, RunningAvgSamplesPerSec=12.008540719848433, 
CurrSamplesPerSec=11.97308089705068, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:36:33,135] [INFO] [logging.py:68:log_dist] [Rank 0] step=4950, skipped=8, lr=[1.3111111111111113e-07], mom=[[0.9, 0.999]] [2022-12-20 03:36:33,136] [INFO] [timer.py:197:stop] 0/4950, RunningAvgSamplesPerSec=12.008542975135665, CurrSamplesPerSec=12.019710258466205, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 1.3111111111111113e-07, 'epoch': 130.26} [2022-12-20 03:36:39,765] [INFO] [timer.py:197:stop] 0/4951, RunningAvgSamplesPerSec=12.008526611752208, CurrSamplesPerSec=11.928102946727659, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:36:46,391] [INFO] [timer.py:197:stop] 0/4952, RunningAvgSamplesPerSec=12.008528960460607, CurrSamplesPerSec=12.02016398281648, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:36:53,041] [INFO] [timer.py:197:stop] 0/4953, RunningAvgSamplesPerSec=12.008514224242036, CurrSamplesPerSec=11.936010446809375, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:36:59,500] [INFO] [timer.py:197:stop] 0/4954, RunningAvgSamplesPerSec=12.008518553104436, CurrSamplesPerSec=12.029989078087217, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:37:05,982] [INFO] [timer.py:197:stop] 0/4955, RunningAvgSamplesPerSec=12.008507816400765, CurrSamplesPerSec=11.955574073459616, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:37:12,458] [INFO] [timer.py:197:stop] 0/4956, RunningAvgSamplesPerSec=12.008495633259239, CurrSamplesPerSec=11.94845430391108, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:37:18,969] [INFO] [timer.py:197:stop] 0/4957, RunningAvgSamplesPerSec=12.008467671467148, CurrSamplesPerSec=11.871524963969993, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:37:25,486] [INFO] [timer.py:197:stop] 0/4958, RunningAvgSamplesPerSec=12.00843664402419, CurrSamplesPerSec=11.856639477147093, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 
03:37:32,117] [INFO] [timer.py:197:stop] 0/4959, RunningAvgSamplesPerSec=12.008418253593625, CurrSamplesPerSec=11.917961974469094, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:37:38,660] [INFO] [logging.py:68:log_dist] [Rank 0] step=4960, skipped=8, lr=[1.088888888888889e-07], mom=[[0.9, 0.999]] [2022-12-20 03:37:38,661] [INFO] [timer.py:197:stop] 0/4960, RunningAvgSamplesPerSec=12.00842141614929, CurrSamplesPerSec=12.024118701256736, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:37:45,265] [INFO] [timer.py:197:stop] 0/4961, RunningAvgSamplesPerSec=12.008420475184472, CurrSamplesPerSec=12.003756983752352, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:37:51,913] [INFO] [timer.py:197:stop] 0/4962, RunningAvgSamplesPerSec=12.008424730661805, CurrSamplesPerSec=12.02956480058566, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:37:58,357] [INFO] [timer.py:197:stop] 0/4963, RunningAvgSamplesPerSec=12.008423865314539, CurrSamplesPerSec=12.004133276755004, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:38:04,799] [INFO] [timer.py:197:stop] 0/4964, RunningAvgSamplesPerSec=12.00840722106865, CurrSamplesPerSec=11.926399134161821, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:38:11,334] [INFO] [timer.py:197:stop] 0/4965, RunningAvgSamplesPerSec=12.008382745640192, CurrSamplesPerSec=11.888151872603833, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:38:17,806] [INFO] [timer.py:197:stop] 0/4966, RunningAvgSamplesPerSec=12.008367631754806, CurrSamplesPerSec=11.933823154142758, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:38:24,410] [INFO] [timer.py:197:stop] 0/4967, RunningAvgSamplesPerSec=12.00836810130194, CurrSamplesPerSec=12.010699385861837, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:38:30,960] [INFO] [timer.py:197:stop] 0/4968, RunningAvgSamplesPerSec=12.008349833284846, CurrSamplesPerSec=11.918329203464383, MemAllocated=1.52GB, MaxMemAllocated=26.06GB 
[2022-12-20 03:38:37,458] [INFO] [timer.py:197:stop] 0/4969, RunningAvgSamplesPerSec=12.008328084912334, CurrSamplesPerSec=11.90128856994535, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:38:43,960] [INFO] [logging.py:68:log_dist] [Rank 0] step=4970, skipped=8, lr=[8.666666666666668e-08], mom=[[0.9, 0.999]] [2022-12-20 03:38:43,961] [INFO] [timer.py:197:stop] 0/4970, RunningAvgSamplesPerSec=12.00831246666597, CurrSamplesPerSec=11.931234673455371, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:38:50,472] [INFO] [timer.py:197:stop] 0/4971, RunningAvgSamplesPerSec=12.008295299518881, CurrSamplesPerSec=11.923610489863293, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:38:56,898] [INFO] [timer.py:197:stop] 0/4972, RunningAvgSamplesPerSec=12.008277537404123, CurrSamplesPerSec=11.920661688745472, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:39:03,372] [INFO] [timer.py:197:stop] 0/4973, RunningAvgSamplesPerSec=12.008264933199136, CurrSamplesPerSec=11.945947189599819, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:39:09,805] [INFO] [timer.py:197:stop] 0/4974, RunningAvgSamplesPerSec=12.0082443087022, CurrSamplesPerSec=11.906588033241784, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:39:16,278] [INFO] [timer.py:197:stop] 0/4975, RunningAvgSamplesPerSec=12.008215835506984, CurrSamplesPerSec=11.868296989042905, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 7.555555555555556e-08, 'epoch': 130.92} [2022-12-20 03:39:22,858] [INFO] [timer.py:197:stop] 0/4976, RunningAvgSamplesPerSec=12.008162542120141, CurrSamplesPerSec=11.748858686113625, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:39:29,379] [INFO] [timer.py:197:stop] 0/4977, RunningAvgSamplesPerSec=12.008137926051889, CurrSamplesPerSec=11.886933703273394, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:39:34,000] [INFO] [timer.py:197:stop] 0/4978, RunningAvgSamplesPerSec=12.008788725109264, 
CurrSamplesPerSec=16.442007651873855, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:39:40,533] [INFO] [timer.py:197:stop] 0/4979, RunningAvgSamplesPerSec=12.008754640537806, CurrSamplesPerSec=11.841512323118716, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:39:47,031] [INFO] [logging.py:68:log_dist] [Rank 0] step=4980, skipped=8, lr=[6.444444444444445e-08], mom=[[0.9, 0.999]] [2022-12-20 03:39:47,031] [INFO] [timer.py:197:stop] 0/4980, RunningAvgSamplesPerSec=12.00875388989061, CurrSamplesPerSec=12.005019080942184, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:39:53,496] [INFO] [timer.py:197:stop] 0/4981, RunningAvgSamplesPerSec=12.008748184326464, CurrSamplesPerSec=11.980412916178446, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:39:59,934] [INFO] [timer.py:197:stop] 0/4982, RunningAvgSamplesPerSec=12.008738119300284, CurrSamplesPerSec=11.958832656683786, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:40:06,394] [INFO] [timer.py:197:stop] 0/4983, RunningAvgSamplesPerSec=12.008721399193758, CurrSamplesPerSec=11.926028758458687, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:40:12,790] [INFO] [timer.py:197:stop] 0/4984, RunningAvgSamplesPerSec=12.00872402785835, CurrSamplesPerSec=12.021831700643261, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:40:19,271] [INFO] [timer.py:197:stop] 0/4985, RunningAvgSamplesPerSec=12.008705448971128, CurrSamplesPerSec=11.916853545810055, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:40:25,734] [INFO] [timer.py:197:stop] 0/4986, RunningAvgSamplesPerSec=12.008705350771514, CurrSamplesPerSec=12.008216042038594, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:40:32,183] [INFO] [timer.py:197:stop] 0/4987, RunningAvgSamplesPerSec=12.008687509492566, CurrSamplesPerSec=11.920420300068045, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:40:38,610] [INFO] [timer.py:197:stop] 0/4988, 
RunningAvgSamplesPerSec=12.008669093707226, CurrSamplesPerSec=11.917563022217992, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:40:45,102] [INFO] [timer.py:197:stop] 0/4989, RunningAvgSamplesPerSec=12.00864853574812, CurrSamplesPerSec=11.907014243459093, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:40:51,538] [INFO] [logging.py:68:log_dist] [Rank 0] step=4990, skipped=8, lr=[4.222222222222222e-08], mom=[[0.9, 0.999]] [2022-12-20 03:40:51,539] [INFO] [timer.py:197:stop] 0/4990, RunningAvgSamplesPerSec=12.00865085939925, CurrSamplesPerSec=12.020250102812403, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:40:58,002] [INFO] [timer.py:197:stop] 0/4991, RunningAvgSamplesPerSec=12.008650174509231, CurrSamplesPerSec=12.00523491485884, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:41:04,484] [INFO] [timer.py:197:stop] 0/4992, RunningAvgSamplesPerSec=12.00864119794204, CurrSamplesPerSec=11.964023531249076, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:41:11,029] [INFO] [timer.py:197:stop] 0/4993, RunningAvgSamplesPerSec=12.008616216765558, CurrSamplesPerSec=11.885241104591671, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:41:17,569] [INFO] [timer.py:197:stop] 0/4994, RunningAvgSamplesPerSec=12.008588982897615, CurrSamplesPerSec=11.874186345250775, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:41:24,043] [INFO] [timer.py:197:stop] 0/4995, RunningAvgSamplesPerSec=12.008567272225012, CurrSamplesPerSec=11.901157185895284, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:41:30,553] [INFO] [timer.py:197:stop] 0/4996, RunningAvgSamplesPerSec=12.008544185573175, CurrSamplesPerSec=11.89436873814182, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:41:37,062] [INFO] [timer.py:197:stop] 0/4997, RunningAvgSamplesPerSec=12.008523161674619, CurrSamplesPerSec=11.904440017973153, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:41:43,507] [INFO] 
[timer.py:197:stop] 0/4998, RunningAvgSamplesPerSec=12.008524828049, CurrSamplesPerSec=12.016854142583346, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:41:49,927] [INFO] [timer.py:197:stop] 0/4999, RunningAvgSamplesPerSec=12.008525593184363, CurrSamplesPerSec=12.012349426920148, MemAllocated=1.52GB, MaxMemAllocated=26.06GB [2022-12-20 03:41:56,393] [INFO] [logging.py:68:log_dist] [Rank 0] step=5000, skipped=8, lr=[2e-08], mom=[[0.9, 0.999]] [2022-12-20 03:41:56,394] [INFO] [timer.py:197:stop] 0/5000, RunningAvgSamplesPerSec=12.008528795640862, CurrSamplesPerSec=12.024552828812292, MemAllocated=1.52GB, MaxMemAllocated=26.06GB {'loss': 0.0, 'learning_rate': 2e-08, 'epoch': 131.58} {'eval_loss': 0.451171875, 'eval_wer': 17.988338192419825, 'eval_runtime': 166.9, 'eval_samples_per_second': 7.232, 'eval_steps_per_second': 0.228, 'epoch': 131.58} [2022-12-20 03:44:45,126] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step5000 is begin to save! [2022-12-20 03:44:45,134] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: ./checkpoint-5000/global_step5000/mp_rank_00_model_states.pt [2022-12-20 03:44:45,134] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving ./checkpoint-5000/global_step5000/mp_rank_00_model_states.pt... [2022-12-20 03:44:46,917] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved ./checkpoint-5000/global_step5000/mp_rank_00_model_states.pt. [2022-12-20 03:44:46,918] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving ./checkpoint-5000/global_step5000/zero_pp_rank_0_mp_rank_00_optim_states.pt... [2022-12-20 03:44:54,246] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved ./checkpoint-5000/global_step5000/zero_pp_rank_0_mp_rank_00_optim_states.pt. 
[2022-12-20 03:44:54,246] [INFO] [engine.py:3269:_save_zero_checkpoint] zero checkpoint saved ./checkpoint-5000/global_step5000/zero_pp_rank_0_mp_rank_00_optim_states.pt [2022-12-20 03:44:54,246] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step5000 is ready now! [2022-12-20 03:46:14,829] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.7.7, git-hash=unknown, git-branch=unknown [2022-12-20 03:46:14,877] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False [2022-12-20 03:46:16,211] [WARNING] [cpu_adam.py:83:__init__] FP16 params for CPUAdam may not work on AMD CPUs Installed CUDA version 11.6 does not match the version torch was compiled with 11.7 but since the APIs are compatible, accepting this combination Time to load cpu_adam op: 2.9200432300567627 seconds Adam Optimizer #1 is created with AVX2 arithmetic capability. Config: alpha=0.000010, betas=(0.900000, 0.999000), weight_decay=0.000000, adam_w=1 [2022-12-20 03:46:19,653] [INFO] [logging.py:68:log_dist] [Rank 0] Using DeepSpeed Optimizer param name adamw as basic optimizer [2022-12-20 03:46:19,826] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Basic Optimizer = DeepSpeedCPUAdam [2022-12-20 03:46:19,826] [INFO] [utils.py:52:is_zero_supported_optimizer] Checking ZeRO support for optimizer=DeepSpeedCPUAdam type= [2022-12-20 03:46:19,826] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 2 optimizer [2022-12-20 03:46:19,826] [INFO] [stage_1_and_2.py:140:__init__] Reduce bucket size 200000000 [2022-12-20 03:46:19,826] [INFO] [stage_1_and_2.py:141:__init__] Allgather bucket size 200000000 [2022-12-20 03:46:19,826] [INFO] [stage_1_and_2.py:142:__init__] CPU Offload: True [2022-12-20 03:46:19,826] [INFO] [stage_1_and_2.py:143:__init__] Round robin gradient partitioning: False Time to load utils op: 0.0003986358642578125 seconds Rank: 0 partition count [1] and sizes[(763857920, False)] [2022-12-20 03:46:21,910] [INFO] 
[utils.py:827:see_memory_usage] Before initializing optimizer states [2022-12-20 03:46:21,911] [INFO] [utils.py:828:see_memory_usage] MA 3.04 GB Max_MA 26.06 GB CA 31.32 GB Max_CA 31 GB [2022-12-20 03:46:21,911] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 25.7 GB, percent = 13.1% [2022-12-20 03:46:23,916] [INFO] [utils.py:827:see_memory_usage] After initializing optimizer states [2022-12-20 03:46:23,917] [INFO] [utils.py:828:see_memory_usage] MA 3.04 GB Max_MA 3.04 GB CA 31.32 GB Max_CA 31 GB [2022-12-20 03:46:23,917] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 35.47 GB, percent = 18.0% [2022-12-20 03:46:23,917] [INFO] [stage_1_and_2.py:525:__init__] optimizer state initialized [2022-12-20 03:46:23,992] [INFO] [utils.py:827:see_memory_usage] After initializing ZeRO optimizer [2022-12-20 03:46:23,993] [INFO] [utils.py:828:see_memory_usage] MA 3.04 GB Max_MA 3.04 GB CA 31.32 GB Max_CA 31 GB [2022-12-20 03:46:23,993] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 35.46 GB, percent = 18.0% [2022-12-20 03:46:24,015] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = adamw [2022-12-20 03:46:24,015] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed using configured LR scheduler = WarmupDecayLR [2022-12-20 03:46:24,015] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler = [2022-12-20 03:46:24,016] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[1e-05], mom=[[0.9, 0.999]] [2022-12-20 03:46:24,017] [INFO] [config.py:1020:print] DeepSpeedEngine configuration: [2022-12-20 03:46:24,017] [INFO] [config.py:1024:print] activation_checkpointing_config { "partition_activations": false, "contiguous_memory_optimization": false, "cpu_checkpointing": false, "number_checkpoints": null, "synchronize_checkpoint_boundary": false, "profile": false } [2022-12-20 03:46:24,017] [INFO] [config.py:1024:print] aio_config ................... 
{'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True} [2022-12-20 03:46:24,017] [INFO] [config.py:1024:print] amp_enabled .................. False [2022-12-20 03:46:24,017] [INFO] [config.py:1024:print] amp_params ................... False [2022-12-20 03:46:24,017] [INFO] [config.py:1024:print] autotuning_config ............ { "enabled": false, "start_step": null, "end_step": null, "metric_path": null, "arg_mappings": null, "metric": "throughput", "model_info": null, "results_dir": "autotuning_results", "exps_dir": "autotuning_exps", "overwrite": true, "fast": true, "start_profile_step": 3, "end_profile_step": 5, "tuner_type": "gridsearch", "tuner_early_stopping": 5, "tuner_num_trials": 50, "model_info_path": null, "mp_size": 1, "max_train_batch_size": null, "min_train_batch_size": 1, "max_train_micro_batch_size_per_gpu": 1.024000e+03, "min_train_micro_batch_size_per_gpu": 1, "num_tuning_micro_batch_sizes": 3 } [2022-12-20 03:46:24,017] [INFO] [config.py:1024:print] bfloat16_enabled ............. False [2022-12-20 03:46:24,017] [INFO] [config.py:1024:print] checkpoint_parallel_write_pipeline False [2022-12-20 03:46:24,017] [INFO] [config.py:1024:print] checkpoint_tag_validation_enabled True [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] checkpoint_tag_validation_fail False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] comms_config ................. [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] communication_data_type ...... None [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] compression_config ........... 
{'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] curriculum_enabled ........... False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] curriculum_params ............ False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] dataloader_drop_last ......... False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] disable_allgather ............ False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] dump_state ................... False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] dynamic_loss_scale_args ...... {'init_scale': 65536, 'scale_window': 1000, 'delayed_shift': 2, 'min_scale': 1} [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] eigenvalue_enabled ........... False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] eigenvalue_gas_boundary_resolution 1 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] eigenvalue_layer_name ........ 
bert.encoder.layer [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] eigenvalue_layer_num ......... 0 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] eigenvalue_max_iter .......... 100 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] eigenvalue_stability ......... 1e-06 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] eigenvalue_tol ............... 0.01 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] eigenvalue_verbose ........... False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] elasticity_enabled ........... False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] flops_profiler_config ........ { "enabled": false, "profile_step": 1, "module_depth": -1, "top_modules": 1, "detailed": true, "output_file": null } [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] fp16_auto_cast ............... False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] fp16_enabled ................. True [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] fp16_master_weights_and_gradients False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] global_rank .................. 0 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] grad_accum_dtype ............. None [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] gradient_accumulation_steps .. 1 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] gradient_clipping ............ 1.0 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] gradient_predivide_factor .... 1.0 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] initial_dynamic_scale ........ 65536 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] load_universal_checkpoint .... False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] loss_scale ................... 0 [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] memory_breakdown ............. False [2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] monitor_config ............... 
[2022-12-20 03:46:24,018] [INFO] [config.py:1024:print] nebula_config ................ { "enabled": false, "persistent_storage_path": null, "persistent_time_interval": 100, "num_of_version_in_retention": 2, "enable_nebula_load": true, "load_path": null } [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] optimizer_legacy_fusion ...... False [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] optimizer_name ............... adamw [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] optimizer_params ............. {'lr': 1e-05, 'betas': [0.9, 0.999], 'eps': 1e-08, 'weight_decay': 0.0} [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0} [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] pld_enabled .................. False [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] pld_params ................... False [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] prescale_gradients ........... False [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] scheduler_name ............... WarmupDecayLR [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] scheduler_params ............. {'last_batch_iteration': -1, 'total_num_steps': 5000, 'warmup_min_lr': 0, 'warmup_max_lr': 1e-05, 'warmup_num_steps': 500} [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] sparse_attention ............. None [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] sparse_gradients_enabled ..... False [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] steps_per_print .............. 10 [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] train_batch_size ............. 64 [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] train_micro_batch_size_per_gpu 64 [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] use_node_local_storage ....... 
False [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] wall_clock_breakdown ......... False [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] world_size ................... 1 [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] zero_allow_untested_optimizer False [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] zero_config .................. stage=2 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=200000000 allgather_partitions=True allgather_bucket_size=200000000 overlap_comm=True load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=DeepSpeedZeroOffloadOptimizerConfig(device='cpu', nvme_path=None, buffer_count=4, pin_memory=True, pipeline=False, pipeline_read=False, pipeline_write=False, fast_init=False) sub_group_size=1,000,000,000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50,000,000 param_persistence_threshold=100,000 model_persistence_threshold=sys.maxsize max_live_parameters=1,000,000,000 max_reuse_distance=1,000,000,000 gather_16bit_weights_on_model_save=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] zero_enabled ................. True [2022-12-20 03:46:24,019] [INFO] [config.py:1024:print] zero_optimization_stage ...... 
2 [2022-12-20 03:46:24,019] [INFO] [config.py:1009:print_user_config] json = { "fp16": { "enabled": true, "loss_scale": 0, "loss_scale_window": 1000, "initial_scale_power": 16, "hysteresis": 2, "min_loss_scale": 1 }, "optimizer": { "type": "AdamW", "params": { "lr": 1e-05, "betas": [0.9, 0.999], "eps": 1e-08, "weight_decay": 0.0 } }, "scheduler": { "type": "WarmupDecayLR", "params": { "last_batch_iteration": -1, "total_num_steps": 5.000000e+03, "warmup_min_lr": 0, "warmup_max_lr": 1e-05, "warmup_num_steps": 500 } }, "zero_optimization": { "stage": 2, "offload_optimizer": { "device": "cpu", "pin_memory": true }, "allgather_partitions": true, "allgather_bucket_size": 2.000000e+08, "overlap_comm": true, "reduce_scatter": true, "reduce_bucket_size": 2.000000e+08, "contiguous_gradients": true }, "gradient_accumulation_steps": 1, "gradient_clipping": 1.0, "train_batch_size": 64, "train_micro_batch_size_per_gpu": 64 } Time to load utils op: 0.0003142356872558594 seconds [2022-12-20 03:46:24,020] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from ./checkpoint-3000/global_step3000/mp_rank_00_model_states.pt... [2022-12-20 03:46:24,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from ./checkpoint-3000/global_step3000/mp_rank_00_model_states.pt. [2022-12-20 03:46:24,607] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from ./checkpoint-3000/global_step3000/mp_rank_00_model_states.pt... [2022-12-20 03:46:25,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from ./checkpoint-3000/global_step3000/mp_rank_00_model_states.pt. [2022-12-20 03:46:25,353] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from ./checkpoint-3000/global_step3000/zero_pp_rank_0_mp_rank_00_optim_states.pt... [2022-12-20 03:46:28,248] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from ./checkpoint-3000/global_step3000/zero_pp_rank_0_mp_rank_00_optim_states.pt. 
[2022-12-20 03:46:28,248] [INFO] [engine.py:2900:_get_all_zero_checkpoint_state_dicts] successfully read 1 ZeRO state_dicts for rank 0 [2022-12-20 03:46:28,935] [INFO] [engine.py:2840:_load_zero_checkpoint] loading 1 zero partition checkpoints for rank 0 {'train_runtime': 33627.3667, 'train_samples_per_second': 9.516, 'train_steps_per_second': 0.149, 'train_loss': 0.007872294521331787, 'epoch': 131.58} 12/20/2022 03:47:54 - WARNING - huggingface_hub.repository - Several commits (2) will be pushed upstream. 12/20/2022 03:47:54 - WARNING - huggingface_hub.repository - The progress bars may be unreliable. 12/20/2022 03:48:00 - WARNING - huggingface_hub.repository - remote: Scanning LFS files for validity, may be slow... remote: LFS file scan complete. To https://huggingface.co/mikr/whisper-medium-sl-cv11 ccb40c0..c72c9a2 main -> main 12/20/2022 03:48:48 - WARNING - huggingface_hub.repository - To https://huggingface.co/mikr/whisper-medium-sl-cv11 c72c9a2..f3f29e8 main -> main ***** train metrics ***** epoch = 131.58 train_loss = 0.0079 train_runtime = 9:20:27.36 train_samples_per_second = 9.516 train_steps_per_second = 0.149 12/20/2022 03:48:52 - INFO - __main__ - *** Evaluate *** ***** eval metrics ***** epoch = 131.58 eval_loss = 0.4331 eval_runtime = 0:02:48.34 eval_samples_per_second = 7.17 eval_steps_per_second = 0.226 eval_wer = 17.93 12/20/2022 03:52:36 - WARNING - huggingface_hub.repository - remote: Scanning LFS files for validity, may be slow... remote: LFS file scan complete. To https://huggingface.co/mikr/whisper-medium-sl-cv11 f3f29e8..414a767 main -> main