[2024-08-05 14:38:20,648][15372] Saving configuration to /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/config.json... [2024-08-05 14:38:20,648][15372] Rollout worker 0 uses device cpu [2024-08-05 14:38:20,648][15372] Rollout worker 1 uses device cpu [2024-08-05 14:38:20,648][15372] Rollout worker 2 uses device cpu [2024-08-05 14:38:20,648][15372] Rollout worker 3 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 4 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 5 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 6 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 7 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 8 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 9 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 10 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 11 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 12 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 13 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 14 uses device cpu [2024-08-05 14:38:20,649][15372] Rollout worker 15 uses device cpu [2024-08-05 14:38:20,650][15372] Rollout worker 16 uses device cpu [2024-08-05 14:38:20,650][15372] Rollout worker 17 uses device cpu [2024-08-05 14:38:20,650][15372] Rollout worker 18 uses device cpu [2024-08-05 14:38:20,650][15372] Rollout worker 19 uses device cpu [2024-08-05 14:38:21,027][15372] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-08-05 14:38:21,027][15372] InferenceWorker_p0-w0: min num requests: 6 [2024-08-05 14:38:21,071][15372] Starting all processes... [2024-08-05 14:38:21,071][15372] Starting process learner_proc0 [2024-08-05 14:38:22,485][15372] Starting all processes... 
[2024-08-05 14:38:22,488][15417] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-08-05 14:38:22,489][15417] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-08-05 14:38:22,490][15372] Starting process inference_proc0-0 [2024-08-05 14:38:22,492][15372] Starting process rollout_proc0 [2024-08-05 14:38:22,496][15372] Starting process rollout_proc1 [2024-08-05 14:38:22,497][15372] Starting process rollout_proc2 [2024-08-05 14:38:22,500][15372] Starting process rollout_proc3 [2024-08-05 14:38:22,513][15417] Num visible devices: 1 [2024-08-05 14:38:22,500][15372] Starting process rollout_proc4 [2024-08-05 14:38:22,500][15372] Starting process rollout_proc5 [2024-08-05 14:38:22,502][15372] Starting process rollout_proc6 [2024-08-05 14:38:22,511][15372] Starting process rollout_proc7 [2024-08-05 14:38:22,517][15372] Starting process rollout_proc8 [2024-08-05 14:38:22,520][15372] Starting process rollout_proc9 [2024-08-05 14:38:22,522][15372] Starting process rollout_proc10 [2024-08-05 14:38:22,540][15417] Starting seed is not provided [2024-08-05 14:38:22,527][15372] Starting process rollout_proc11 [2024-08-05 14:38:22,540][15417] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-08-05 14:38:22,540][15417] Initializing actor-critic model on device cuda:0 [2024-08-05 14:38:22,540][15417] RunningMeanStd input shape: (23,) [2024-08-05 14:38:22,541][15417] RunningMeanStd input shape: (3, 72, 128) [2024-08-05 14:38:22,541][15417] RunningMeanStd input shape: (1,) [2024-08-05 14:38:22,555][15417] ConvEncoder: input_channels=3 [2024-08-05 14:38:22,537][15372] Starting process rollout_proc12 [2024-08-05 14:38:22,537][15372] Starting process rollout_proc13 [2024-08-05 14:38:22,546][15372] Starting process rollout_proc14 [2024-08-05 14:38:22,819][15417] Conv encoder output size: 512 [2024-08-05 14:38:22,820][15417] Policy head output size: 640 [2024-08-05 14:38:22,864][15417] Created Actor Critic model with 
architecture:
[2024-08-05 14:38:22,864][15417] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (measurements): RunningMeanStdInPlace()
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
    (measurements_head): Sequential(
      (0): Linear(in_features=23, out_features=128, bias=True)
      (1): ELU(alpha=1.0)
      (2): Linear(in_features=128, out_features=128, bias=True)
      (3): ELU(alpha=1.0)
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(640, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=21, bias=True)
  )
)
[2024-08-05 14:38:23,524][15417] Using optimizer
[2024-08-05 14:38:24,769][15417] No checkpoints found
[2024-08-05 14:38:24,769][15417] Did not load from checkpoint, starting from scratch!
[2024-08-05 14:38:24,769][15417] Initialized policy 0 weights for model version 0
[2024-08-05 14:38:24,771][15417] LearnerWorker_p0 finished initialization!
[2024-08-05 14:38:24,772][15417] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-08-05 14:38:25,350][15372] Starting process rollout_proc15 [2024-08-05 14:38:25,361][15460] Worker 6 uses CPU cores [6] [2024-08-05 14:38:25,517][15372] Starting process rollout_proc16 [2024-08-05 14:38:25,528][15444] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-08-05 14:38:25,532][15444] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-08-05 14:38:25,592][15372] Starting process rollout_proc17 [2024-08-05 14:38:25,605][15464] Worker 10 uses CPU cores [2] [2024-08-05 14:38:25,624][15444] Num visible devices: 1 [2024-08-05 14:38:25,841][15372] Starting process rollout_proc18 [2024-08-05 14:38:25,848][15469] Worker 14 uses CPU cores [6] [2024-08-05 14:38:25,874][15372] Starting process rollout_proc19 [2024-08-05 14:38:25,892][15467] Worker 9 uses CPU cores [1] [2024-08-05 14:38:25,956][15468] Worker 13 uses CPU cores [5] [2024-08-05 14:38:25,992][15448] Worker 1 uses CPU cores [1] [2024-08-05 14:38:26,016][15457] Worker 3 uses CPU cores [3] [2024-08-05 14:38:26,134][15466] Worker 12 uses CPU cores [4] [2024-08-05 14:38:26,137][15462] Worker 7 uses CPU cores [7] [2024-08-05 14:38:26,172][15461] Worker 5 uses CPU cores [5] [2024-08-05 14:38:26,289][15447] Worker 0 uses CPU cores [0] [2024-08-05 14:38:26,292][15465] Worker 11 uses CPU cores [3] [2024-08-05 14:38:26,327][15444] RunningMeanStd input shape: (23,) [2024-08-05 14:38:26,327][15444] RunningMeanStd input shape: (3, 72, 128) [2024-08-05 14:38:26,328][15444] RunningMeanStd input shape: (1,) [2024-08-05 14:38:26,340][15444] ConvEncoder: input_channels=3 [2024-08-05 14:38:26,457][15463] Worker 8 uses CPU cores [0] [2024-08-05 14:38:26,532][15444] Conv encoder output size: 512 [2024-08-05 14:38:26,537][15444] Policy head output size: 640 [2024-08-05 14:38:26,635][15458] Worker 4 uses CPU cores [4] [2024-08-05 14:38:26,747][15456] Worker 2 uses CPU cores [2] 
[2024-08-05 14:38:27,313][15588] Worker 15 uses CPU cores [7] [2024-08-05 14:38:27,508][15626] Worker 19 uses CPU cores [6, 7] [2024-08-05 14:38:27,510][15600] Worker 17 uses CPU cores [2, 3] [2024-08-05 14:38:27,520][15596] Worker 16 uses CPU cores [0, 1] [2024-08-05 14:38:27,669][15372] Inference worker 0-0 is ready! [2024-08-05 14:38:27,670][15372] All inference workers are ready! Signal rollout workers to start! [2024-08-05 14:38:27,671][15372] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-08-05 14:38:27,678][15625] Worker 18 uses CPU cores [4, 5] [2024-08-05 14:38:27,720][15457] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,722][15462] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,723][15467] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,721][15588] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,722][15465] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,722][15448] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,735][15468] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,741][15466] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,742][15463] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,743][15460] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,738][15461] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,744][15458] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,745][15456] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,747][15626] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,740][15600] Doom resolution: 160x120, resize resolution: (128, 72) 
[2024-08-05 14:38:27,749][15469] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,746][15447] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,746][15596] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,753][15464] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:27,778][15625] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 14:38:28,119][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-08-05 14:38:28,702][15462] Decorrelating experience for 0 frames... [2024-08-05 14:38:28,704][15461] Decorrelating experience for 0 frames... [2024-08-05 14:38:28,734][15463] Decorrelating experience for 0 frames... [2024-08-05 14:38:28,776][15464] Decorrelating experience for 0 frames... [2024-08-05 14:38:28,810][15465] Decorrelating experience for 0 frames... [2024-08-05 14:38:28,813][15457] Decorrelating experience for 0 frames... [2024-08-05 14:38:28,885][15468] Decorrelating experience for 0 frames... [2024-08-05 14:38:28,902][15467] Decorrelating experience for 0 frames... [2024-08-05 14:38:28,903][15462] Decorrelating experience for 32 frames... [2024-08-05 14:38:28,904][15448] Decorrelating experience for 0 frames... [2024-08-05 14:38:28,953][15456] Decorrelating experience for 0 frames... [2024-08-05 14:38:29,053][15466] Decorrelating experience for 0 frames... [2024-08-05 14:38:29,085][15626] Decorrelating experience for 0 frames... [2024-08-05 14:38:29,109][15465] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,109][15625] Decorrelating experience for 0 frames... [2024-08-05 14:38:29,132][15469] Decorrelating experience for 0 frames... [2024-08-05 14:38:29,142][15456] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,161][15462] Decorrelating experience for 64 frames... 
[2024-08-05 14:38:29,184][15461] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,262][15448] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,307][15625] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,352][15456] Decorrelating experience for 64 frames... [2024-08-05 14:38:29,460][15469] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,467][15596] Decorrelating experience for 0 frames... [2024-08-05 14:38:29,509][15588] Decorrelating experience for 0 frames... [2024-08-05 14:38:29,524][15600] Decorrelating experience for 0 frames... [2024-08-05 14:38:29,529][15626] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,544][15462] Decorrelating experience for 96 frames... [2024-08-05 14:38:29,557][15465] Decorrelating experience for 64 frames... [2024-08-05 14:38:29,669][15456] Decorrelating experience for 96 frames... [2024-08-05 14:38:29,672][15448] Decorrelating experience for 64 frames... [2024-08-05 14:38:29,700][15466] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,740][15626] Decorrelating experience for 64 frames... [2024-08-05 14:38:29,744][15457] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,809][15447] Decorrelating experience for 0 frames... [2024-08-05 14:38:29,813][15588] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,860][15600] Decorrelating experience for 32 frames... [2024-08-05 14:38:29,903][15448] Decorrelating experience for 96 frames... [2024-08-05 14:38:30,019][15588] Decorrelating experience for 64 frames... [2024-08-05 14:38:30,036][15463] Decorrelating experience for 32 frames... [2024-08-05 14:38:30,043][15464] Decorrelating experience for 32 frames... [2024-08-05 14:38:30,083][15468] Decorrelating experience for 32 frames... [2024-08-05 14:38:30,144][15456] Decorrelating experience for 128 frames... [2024-08-05 14:38:30,148][15447] Decorrelating experience for 32 frames... 
[2024-08-05 14:38:30,181][15469] Decorrelating experience for 64 frames... [2024-08-05 14:38:30,204][15461] Decorrelating experience for 64 frames... [2024-08-05 14:38:30,261][15467] Decorrelating experience for 32 frames... [2024-08-05 14:38:30,267][15460] Decorrelating experience for 0 frames... [2024-08-05 14:38:30,354][15588] Decorrelating experience for 96 frames... [2024-08-05 14:38:30,369][15468] Decorrelating experience for 64 frames... [2024-08-05 14:38:30,372][15457] Decorrelating experience for 64 frames... [2024-08-05 14:38:30,440][15448] Decorrelating experience for 128 frames... [2024-08-05 14:38:30,466][15447] Decorrelating experience for 64 frames... [2024-08-05 14:38:30,477][15464] Decorrelating experience for 64 frames... [2024-08-05 14:38:30,492][15625] Decorrelating experience for 64 frames... [2024-08-05 14:38:30,534][15465] Decorrelating experience for 96 frames... [2024-08-05 14:38:30,608][15461] Decorrelating experience for 96 frames... [2024-08-05 14:38:30,620][15463] Decorrelating experience for 64 frames... [2024-08-05 14:38:30,676][15456] Decorrelating experience for 160 frames... [2024-08-05 14:38:30,699][15462] Decorrelating experience for 128 frames... [2024-08-05 14:38:30,700][15460] Decorrelating experience for 32 frames... [2024-08-05 14:38:30,747][15458] Decorrelating experience for 0 frames... [2024-08-05 14:38:30,775][15457] Decorrelating experience for 96 frames... [2024-08-05 14:38:30,777][15626] Decorrelating experience for 96 frames... [2024-08-05 14:38:30,850][15596] Decorrelating experience for 32 frames... [2024-08-05 14:38:30,888][15464] Decorrelating experience for 96 frames... [2024-08-05 14:38:30,890][15461] Decorrelating experience for 128 frames... [2024-08-05 14:38:30,929][15460] Decorrelating experience for 64 frames... [2024-08-05 14:38:30,993][15448] Decorrelating experience for 160 frames... [2024-08-05 14:38:31,042][15463] Decorrelating experience for 96 frames... 
[2024-08-05 14:38:31,130][15467] Decorrelating experience for 64 frames... [2024-08-05 14:38:31,137][15464] Decorrelating experience for 128 frames... [2024-08-05 14:38:31,161][15460] Decorrelating experience for 96 frames... [2024-08-05 14:38:31,169][15466] Decorrelating experience for 64 frames... [2024-08-05 14:38:31,180][15458] Decorrelating experience for 32 frames... [2024-08-05 14:38:31,190][15461] Decorrelating experience for 160 frames... [2024-08-05 14:38:31,292][15465] Decorrelating experience for 128 frames... [2024-08-05 14:38:31,367][15626] Decorrelating experience for 128 frames... [2024-08-05 14:38:31,383][15625] Decorrelating experience for 96 frames... [2024-08-05 14:38:31,396][15596] Decorrelating experience for 64 frames... [2024-08-05 14:38:31,403][15464] Decorrelating experience for 160 frames... [2024-08-05 14:38:31,422][15467] Decorrelating experience for 96 frames... [2024-08-05 14:38:31,433][15462] Decorrelating experience for 160 frames... [2024-08-05 14:38:31,448][15457] Decorrelating experience for 128 frames... [2024-08-05 14:38:31,475][15461] Decorrelating experience for 192 frames... [2024-08-05 14:38:31,618][15448] Decorrelating experience for 192 frames... [2024-08-05 14:38:31,643][15447] Decorrelating experience for 96 frames... [2024-08-05 14:38:31,655][15458] Decorrelating experience for 64 frames... [2024-08-05 14:38:31,691][15464] Decorrelating experience for 192 frames... [2024-08-05 14:38:31,815][15460] Decorrelating experience for 128 frames... [2024-08-05 14:38:31,824][15465] Decorrelating experience for 160 frames... [2024-08-05 14:38:31,875][15469] Decorrelating experience for 96 frames... [2024-08-05 14:38:31,885][15458] Decorrelating experience for 96 frames... [2024-08-05 14:38:31,887][15596] Decorrelating experience for 96 frames... [2024-08-05 14:38:31,963][15457] Decorrelating experience for 160 frames... [2024-08-05 14:38:31,988][15625] Decorrelating experience for 128 frames... 
[2024-08-05 14:38:32,011][15461] Decorrelating experience for 224 frames... [2024-08-05 14:38:32,117][15467] Decorrelating experience for 128 frames... [2024-08-05 14:38:32,172][15463] Decorrelating experience for 128 frames... [2024-08-05 14:38:32,206][15464] Decorrelating experience for 224 frames... [2024-08-05 14:38:32,209][15600] Decorrelating experience for 64 frames... [2024-08-05 14:38:32,312][15462] Decorrelating experience for 192 frames... [2024-08-05 14:38:32,313][15448] Decorrelating experience for 224 frames... [2024-08-05 14:38:32,357][15456] Decorrelating experience for 192 frames... [2024-08-05 14:38:32,533][15596] Decorrelating experience for 128 frames... [2024-08-05 14:38:32,534][15626] Decorrelating experience for 160 frames... [2024-08-05 14:38:32,544][15461] Decorrelating experience for 256 frames... [2024-08-05 14:38:32,584][15465] Decorrelating experience for 192 frames... [2024-08-05 14:38:32,599][15463] Decorrelating experience for 160 frames... [2024-08-05 14:38:32,618][15460] Decorrelating experience for 160 frames... [2024-08-05 14:38:32,657][15469] Decorrelating experience for 128 frames... [2024-08-05 14:38:32,760][15448] Decorrelating experience for 256 frames... [2024-08-05 14:38:32,771][15464] Decorrelating experience for 256 frames... [2024-08-05 14:38:32,889][15625] Decorrelating experience for 160 frames... [2024-08-05 14:38:32,900][15463] Decorrelating experience for 192 frames... [2024-08-05 14:38:32,985][15468] Decorrelating experience for 96 frames... [2024-08-05 14:38:33,001][15462] Decorrelating experience for 224 frames... [2024-08-05 14:38:33,005][15626] Decorrelating experience for 192 frames... [2024-08-05 14:38:33,041][15588] Decorrelating experience for 128 frames... [2024-08-05 14:38:33,118][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-08-05 14:38:33,164][15464] Decorrelating experience for 288 frames... [2024-08-05 14:38:33,175][15625] Decorrelating experience for 192 frames... [2024-08-05 14:38:33,184][15461] Decorrelating experience for 288 frames... [2024-08-05 14:38:33,196][15448] Decorrelating experience for 288 frames... [2024-08-05 14:38:33,202][15463] Decorrelating experience for 224 frames... [2024-08-05 14:38:33,401][15600] Decorrelating experience for 96 frames... [2024-08-05 14:38:33,407][15460] Decorrelating experience for 192 frames... [2024-08-05 14:38:33,488][15596] Decorrelating experience for 160 frames... [2024-08-05 14:38:33,524][15468] Decorrelating experience for 128 frames... [2024-08-05 14:38:33,529][15456] Decorrelating experience for 224 frames... [2024-08-05 14:38:33,646][15588] Decorrelating experience for 160 frames... [2024-08-05 14:38:33,706][15460] Decorrelating experience for 224 frames... [2024-08-05 14:38:33,739][15625] Decorrelating experience for 224 frames... [2024-08-05 14:38:33,778][15463] Decorrelating experience for 256 frames... [2024-08-05 14:38:33,816][15600] Decorrelating experience for 128 frames... [2024-08-05 14:38:33,886][15596] Decorrelating experience for 192 frames... [2024-08-05 14:38:33,931][15448] Decorrelating experience for 320 frames... [2024-08-05 14:38:33,974][15462] Decorrelating experience for 256 frames... [2024-08-05 14:38:34,008][15465] Decorrelating experience for 224 frames... [2024-08-05 14:38:34,020][15460] Decorrelating experience for 256 frames... [2024-08-05 14:38:34,181][15468] Decorrelating experience for 160 frames... [2024-08-05 14:38:34,232][15464] Decorrelating experience for 320 frames... [2024-08-05 14:38:34,240][15457] Decorrelating experience for 192 frames... [2024-08-05 14:38:34,251][15463] Decorrelating experience for 288 frames... [2024-08-05 14:38:34,349][15596] Decorrelating experience for 224 frames... 
[2024-08-05 14:38:34,442][15458] Decorrelating experience for 128 frames... [2024-08-05 14:38:34,449][15460] Decorrelating experience for 288 frames... [2024-08-05 14:38:34,458][15468] Decorrelating experience for 192 frames... [2024-08-05 14:38:34,481][15456] Decorrelating experience for 256 frames... [2024-08-05 14:38:34,485][15448] Decorrelating experience for 352 frames... [2024-08-05 14:38:34,512][15465] Decorrelating experience for 256 frames... [2024-08-05 14:38:34,602][15600] Decorrelating experience for 160 frames... [2024-08-05 14:38:34,651][15588] Decorrelating experience for 192 frames... [2024-08-05 14:38:34,859][15463] Decorrelating experience for 320 frames... [2024-08-05 14:38:34,891][15625] Decorrelating experience for 256 frames... [2024-08-05 14:38:34,899][15466] Decorrelating experience for 96 frames... [2024-08-05 14:38:34,930][15458] Decorrelating experience for 160 frames... [2024-08-05 14:38:34,975][15448] Decorrelating experience for 384 frames... [2024-08-05 14:38:34,992][15469] Decorrelating experience for 160 frames... [2024-08-05 14:38:35,013][15457] Decorrelating experience for 224 frames... [2024-08-05 14:38:35,137][15465] Decorrelating experience for 288 frames... [2024-08-05 14:38:35,156][15464] Decorrelating experience for 352 frames... [2024-08-05 14:38:35,236][15456] Decorrelating experience for 288 frames... [2024-08-05 14:38:35,249][15466] Decorrelating experience for 128 frames... [2024-08-05 14:38:35,251][15447] Decorrelating experience for 128 frames... [2024-08-05 14:38:35,262][15460] Decorrelating experience for 320 frames... [2024-08-05 14:38:35,443][15448] Decorrelating experience for 416 frames... [2024-08-05 14:38:35,455][15596] Decorrelating experience for 256 frames... [2024-08-05 14:38:35,474][15588] Decorrelating experience for 224 frames... [2024-08-05 14:38:35,661][15469] Decorrelating experience for 192 frames... [2024-08-05 14:38:35,701][15458] Decorrelating experience for 192 frames... 
[2024-08-05 14:38:35,726][15466] Decorrelating experience for 160 frames... [2024-08-05 14:38:35,734][15626] Decorrelating experience for 224 frames... [2024-08-05 14:38:35,744][15457] Decorrelating experience for 256 frames... [2024-08-05 14:38:35,803][15463] Decorrelating experience for 352 frames... [2024-08-05 14:38:35,939][15461] Decorrelating experience for 320 frames... [2024-08-05 14:38:35,969][15600] Decorrelating experience for 192 frames... [2024-08-05 14:38:36,017][15596] Decorrelating experience for 288 frames... [2024-08-05 14:38:36,055][15588] Decorrelating experience for 256 frames... [2024-08-05 14:38:36,058][15465] Decorrelating experience for 320 frames... [2024-08-05 14:38:36,112][15448] Decorrelating experience for 448 frames... [2024-08-05 14:38:36,139][15462] Decorrelating experience for 288 frames... [2024-08-05 14:38:36,220][15456] Decorrelating experience for 320 frames... [2024-08-05 14:38:36,264][15466] Decorrelating experience for 192 frames... [2024-08-05 14:38:36,337][15469] Decorrelating experience for 224 frames... [2024-08-05 14:38:36,484][15468] Decorrelating experience for 224 frames... [2024-08-05 14:38:36,533][15625] Decorrelating experience for 288 frames... [2024-08-05 14:38:36,535][15457] Decorrelating experience for 288 frames... [2024-08-05 14:38:36,536][15464] Decorrelating experience for 384 frames... [2024-08-05 14:38:36,629][15626] Decorrelating experience for 256 frames... [2024-08-05 14:38:36,659][15461] Decorrelating experience for 352 frames... [2024-08-05 14:38:36,740][15447] Decorrelating experience for 160 frames... [2024-08-05 14:38:36,869][15600] Decorrelating experience for 224 frames... [2024-08-05 14:38:36,870][15596] Decorrelating experience for 320 frames... [2024-08-05 14:38:36,920][15588] Decorrelating experience for 288 frames... [2024-08-05 14:38:36,984][15465] Decorrelating experience for 352 frames... [2024-08-05 14:38:37,038][15447] Decorrelating experience for 192 frames... 
[2024-08-05 14:38:37,053][15468] Decorrelating experience for 256 frames... [2024-08-05 14:38:37,056][15458] Decorrelating experience for 224 frames... [2024-08-05 14:38:37,132][15462] Decorrelating experience for 320 frames... [2024-08-05 14:38:37,275][15448] Decorrelating experience for 480 frames... [2024-08-05 14:38:37,325][15469] Decorrelating experience for 256 frames... [2024-08-05 14:38:37,364][15464] Decorrelating experience for 416 frames... [2024-08-05 14:38:37,388][15468] Decorrelating experience for 288 frames... [2024-08-05 14:38:37,447][15457] Decorrelating experience for 320 frames... [2024-08-05 14:38:37,489][15460] Decorrelating experience for 352 frames... [2024-08-05 14:38:37,506][15456] Decorrelating experience for 352 frames... [2024-08-05 14:38:37,532][15596] Decorrelating experience for 352 frames... [2024-08-05 14:38:37,579][15625] Decorrelating experience for 320 frames... [2024-08-05 14:38:37,580][15466] Decorrelating experience for 224 frames... [2024-08-05 14:38:37,759][15626] Decorrelating experience for 288 frames... [2024-08-05 14:38:37,826][15457] Decorrelating experience for 352 frames... [2024-08-05 14:38:37,916][15469] Decorrelating experience for 288 frames... [2024-08-05 14:38:37,989][15467] Decorrelating experience for 160 frames... [2024-08-05 14:38:37,993][15596] Decorrelating experience for 384 frames... [2024-08-05 14:38:38,012][15448] Decorrelating experience for 512 frames... [2024-08-05 14:38:38,012][15461] Decorrelating experience for 384 frames... [2024-08-05 14:38:38,099][15458] Decorrelating experience for 256 frames... [2024-08-05 14:38:38,118][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-08-05 14:38:38,242][15457] Decorrelating experience for 384 frames... [2024-08-05 14:38:38,321][15462] Decorrelating experience for 352 frames... 
[2024-08-05 14:38:38,326][15468] Decorrelating experience for 320 frames... [2024-08-05 14:38:38,425][15464] Decorrelating experience for 448 frames... [2024-08-05 14:38:38,467][15463] Decorrelating experience for 384 frames... [2024-08-05 14:38:38,518][15460] Decorrelating experience for 384 frames... [2024-08-05 14:38:38,587][15456] Decorrelating experience for 384 frames... [2024-08-05 14:38:38,692][15625] Decorrelating experience for 352 frames... [2024-08-05 14:38:38,732][15448] Decorrelating experience for 544 frames... [2024-08-05 14:38:38,766][15626] Decorrelating experience for 320 frames... [2024-08-05 14:38:38,846][15596] Decorrelating experience for 416 frames... [2024-08-05 14:38:38,877][15461] Decorrelating experience for 416 frames... [2024-08-05 14:38:38,908][15457] Decorrelating experience for 416 frames... [2024-08-05 14:38:39,009][15465] Decorrelating experience for 384 frames... [2024-08-05 14:38:39,080][15625] Decorrelating experience for 384 frames... [2024-08-05 14:38:39,161][15588] Decorrelating experience for 320 frames... [2024-08-05 14:38:39,174][15463] Decorrelating experience for 416 frames... [2024-08-05 14:38:39,196][15468] Decorrelating experience for 352 frames... [2024-08-05 14:38:39,312][15462] Decorrelating experience for 384 frames... [2024-08-05 14:38:39,379][15469] Decorrelating experience for 320 frames... [2024-08-05 14:38:39,422][15464] Decorrelating experience for 480 frames... [2024-08-05 14:38:39,467][15626] Decorrelating experience for 352 frames... [2024-08-05 14:38:39,487][15466] Decorrelating experience for 256 frames... [2024-08-05 14:38:39,509][15596] Decorrelating experience for 448 frames... [2024-08-05 14:38:39,749][15463] Decorrelating experience for 448 frames... [2024-08-05 14:38:39,754][15457] Decorrelating experience for 448 frames... [2024-08-05 14:38:39,849][15448] Decorrelating experience for 576 frames... [2024-08-05 14:38:39,905][15625] Decorrelating experience for 416 frames... 
[2024-08-05 14:38:39,931][15600] Decorrelating experience for 256 frames... [2024-08-05 14:38:39,946][15468] Decorrelating experience for 384 frames... [2024-08-05 14:38:40,100][15456] Decorrelating experience for 416 frames... [2024-08-05 14:38:40,206][15588] Decorrelating experience for 352 frames... [2024-08-05 14:38:40,241][15626] Decorrelating experience for 384 frames... [2024-08-05 14:38:40,323][15461] Decorrelating experience for 448 frames... [2024-08-05 14:38:40,377][15464] Decorrelating experience for 512 frames... [2024-08-05 14:38:40,434][15447] Decorrelating experience for 224 frames... [2024-08-05 14:38:40,455][15469] Decorrelating experience for 352 frames... [2024-08-05 14:38:40,536][15465] Decorrelating experience for 416 frames... [2024-08-05 14:38:40,628][15625] Decorrelating experience for 448 frames... [2024-08-05 14:38:40,680][15467] Decorrelating experience for 192 frames... [2024-08-05 14:38:40,786][15462] Decorrelating experience for 416 frames... [2024-08-05 14:38:40,978][15600] Decorrelating experience for 288 frames... [2024-08-05 14:38:40,983][15468] Decorrelating experience for 416 frames... [2024-08-05 14:38:41,012][15463] Decorrelating experience for 480 frames... [2024-08-05 14:38:41,023][15372] Heartbeat connected on Batcher_0 [2024-08-05 14:38:41,025][15372] Heartbeat connected on LearnerWorker_p0 [2024-08-05 14:38:41,041][15458] Decorrelating experience for 288 frames... [2024-08-05 14:38:41,065][15372] Heartbeat connected on InferenceWorker_p0-w0 [2024-08-05 14:38:41,080][15457] Decorrelating experience for 480 frames... [2024-08-05 14:38:41,157][15626] Decorrelating experience for 416 frames... [2024-08-05 14:38:41,164][15596] Decorrelating experience for 480 frames... [2024-08-05 14:38:41,211][15447] Decorrelating experience for 256 frames... [2024-08-05 14:38:41,311][15466] Decorrelating experience for 288 frames... [2024-08-05 14:38:41,405][15588] Decorrelating experience for 384 frames... 
[2024-08-05 14:38:41,503][15464] Decorrelating experience for 544 frames... [2024-08-05 14:38:41,651][15600] Decorrelating experience for 320 frames... [2024-08-05 14:38:41,686][15467] Decorrelating experience for 224 frames... [2024-08-05 14:38:41,710][15448] Decorrelating experience for 608 frames... [2024-08-05 14:38:41,718][15626] Decorrelating experience for 448 frames... [2024-08-05 14:38:41,782][15469] Decorrelating experience for 384 frames... [2024-08-05 14:38:41,797][15468] Decorrelating experience for 448 frames... [2024-08-05 14:38:41,799][15625] Decorrelating experience for 480 frames... [2024-08-05 14:38:42,020][15465] Decorrelating experience for 448 frames... [2024-08-05 14:38:42,044][15466] Decorrelating experience for 320 frames... [2024-08-05 14:38:42,089][15463] Decorrelating experience for 512 frames... [2024-08-05 14:38:42,203][15469] Decorrelating experience for 416 frames... [2024-08-05 14:38:42,211][15372] Heartbeat connected on RolloutWorker_w1 [2024-08-05 14:38:42,293][15596] Decorrelating experience for 512 frames... [2024-08-05 14:38:42,412][15462] Decorrelating experience for 448 frames... [2024-08-05 14:38:42,431][15588] Decorrelating experience for 416 frames... [2024-08-05 14:38:42,481][15464] Decorrelating experience for 576 frames... [2024-08-05 14:38:42,519][15465] Decorrelating experience for 480 frames... [2024-08-05 14:38:42,548][15466] Decorrelating experience for 352 frames... [2024-08-05 14:38:42,577][15456] Decorrelating experience for 448 frames... [2024-08-05 14:38:42,581][15461] Decorrelating experience for 480 frames... [2024-08-05 14:38:42,753][15447] Decorrelating experience for 288 frames... [2024-08-05 14:38:42,799][15469] Decorrelating experience for 448 frames... [2024-08-05 14:38:42,806][15458] Decorrelating experience for 320 frames... [2024-08-05 14:38:42,972][15626] Decorrelating experience for 480 frames... [2024-08-05 14:38:43,063][15465] Decorrelating experience for 512 frames... 
[2024-08-05 14:38:43,119][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-08-05 14:38:43,121][15468] Decorrelating experience for 480 frames... [2024-08-05 14:38:43,336][15463] Decorrelating experience for 544 frames... [2024-08-05 14:38:43,338][15588] Decorrelating experience for 448 frames... [2024-08-05 14:38:43,416][15625] Decorrelating experience for 512 frames... [2024-08-05 14:38:43,512][15460] Decorrelating experience for 416 frames... [2024-08-05 14:38:43,521][15466] Decorrelating experience for 384 frames... [2024-08-05 14:38:43,524][15464] Decorrelating experience for 608 frames... [2024-08-05 14:38:43,678][15458] Decorrelating experience for 352 frames... [2024-08-05 14:38:43,827][15456] Decorrelating experience for 480 frames... [2024-08-05 14:38:43,850][15462] Decorrelating experience for 480 frames... [2024-08-05 14:38:43,933][15469] Decorrelating experience for 480 frames... [2024-08-05 14:38:43,937][15447] Decorrelating experience for 320 frames... [2024-08-05 14:38:43,964][15457] Decorrelating experience for 512 frames... [2024-08-05 14:38:43,998][15465] Decorrelating experience for 544 frames... [2024-08-05 14:38:44,192][15461] Decorrelating experience for 512 frames... [2024-08-05 14:38:44,256][15458] Decorrelating experience for 384 frames... [2024-08-05 14:38:44,264][15372] Heartbeat connected on RolloutWorker_w10 [2024-08-05 14:38:44,324][15626] Decorrelating experience for 512 frames... [2024-08-05 14:38:44,374][15467] Decorrelating experience for 256 frames... [2024-08-05 14:38:44,472][15463] Decorrelating experience for 576 frames... [2024-08-05 14:38:44,566][15625] Decorrelating experience for 544 frames... [2024-08-05 14:38:44,747][15460] Decorrelating experience for 448 frames... [2024-08-05 14:38:44,806][15466] Decorrelating experience for 416 frames... 
[2024-08-05 14:38:44,841][15596] Decorrelating experience for 544 frames... [2024-08-05 14:38:44,858][15456] Decorrelating experience for 512 frames... [2024-08-05 14:38:44,974][15447] Decorrelating experience for 352 frames... [2024-08-05 14:38:45,020][15468] Decorrelating experience for 512 frames... [2024-08-05 14:38:45,120][15588] Decorrelating experience for 480 frames... [2024-08-05 14:38:45,301][15626] Decorrelating experience for 544 frames... [2024-08-05 14:38:45,353][15458] Decorrelating experience for 416 frames... [2024-08-05 14:38:45,613][15467] Decorrelating experience for 288 frames... [2024-08-05 14:38:45,630][15457] Decorrelating experience for 544 frames... [2024-08-05 14:38:45,739][15462] Decorrelating experience for 512 frames... [2024-08-05 14:38:45,771][15596] Decorrelating experience for 576 frames... [2024-08-05 14:38:45,792][15466] Decorrelating experience for 448 frames... [2024-08-05 14:38:45,932][15461] Decorrelating experience for 544 frames... [2024-08-05 14:38:46,040][15460] Decorrelating experience for 480 frames... [2024-08-05 14:38:46,096][15600] Decorrelating experience for 352 frames... [2024-08-05 14:38:46,231][15625] Decorrelating experience for 576 frames... [2024-08-05 14:38:46,254][15469] Decorrelating experience for 512 frames... [2024-08-05 14:38:46,375][15468] Decorrelating experience for 544 frames... [2024-08-05 14:38:46,388][15458] Decorrelating experience for 448 frames... [2024-08-05 14:38:46,488][15462] Decorrelating experience for 544 frames... [2024-08-05 14:38:46,533][15463] Decorrelating experience for 608 frames... [2024-08-05 14:38:46,632][15447] Decorrelating experience for 384 frames... [2024-08-05 14:38:46,639][15457] Decorrelating experience for 576 frames... [2024-08-05 14:38:46,724][15596] Decorrelating experience for 608 frames... [2024-08-05 14:38:46,942][15458] Decorrelating experience for 480 frames... [2024-08-05 14:38:47,049][15469] Decorrelating experience for 544 frames... 
[2024-08-05 14:38:47,062][15588] Decorrelating experience for 512 frames... [2024-08-05 14:38:47,194][15456] Decorrelating experience for 544 frames... [2024-08-05 14:38:47,280][15467] Decorrelating experience for 320 frames... [2024-08-05 14:38:47,515][15625] Decorrelating experience for 608 frames... [2024-08-05 14:38:47,577][15466] Decorrelating experience for 480 frames... [2024-08-05 14:38:47,592][15461] Decorrelating experience for 576 frames... [2024-08-05 14:38:47,619][15457] Decorrelating experience for 608 frames... [2024-08-05 14:38:47,641][15462] Decorrelating experience for 576 frames... [2024-08-05 14:38:47,643][15372] Heartbeat connected on RolloutWorker_w8 [2024-08-05 14:38:47,764][15372] Heartbeat connected on RolloutWorker_w16 [2024-08-05 14:38:47,890][15467] Decorrelating experience for 352 frames... [2024-08-05 14:38:47,967][15469] Decorrelating experience for 576 frames... [2024-08-05 14:38:48,118][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 192.2. Samples: 3930. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-08-05 14:38:48,119][15372] Avg episode reward: [(0, '0.817')] [2024-08-05 14:38:48,172][15465] Decorrelating experience for 576 frames... [2024-08-05 14:38:48,222][15588] Decorrelating experience for 544 frames... [2024-08-05 14:38:48,269][15458] Decorrelating experience for 512 frames... [2024-08-05 14:38:48,402][15372] Heartbeat connected on RolloutWorker_w18 [2024-08-05 14:38:48,407][15372] Heartbeat connected on RolloutWorker_w3 [2024-08-05 14:38:48,488][15600] Decorrelating experience for 384 frames... [2024-08-05 14:38:48,594][15467] Decorrelating experience for 384 frames... [2024-08-05 14:38:48,665][15460] Decorrelating experience for 512 frames... [2024-08-05 14:38:48,680][15468] Decorrelating experience for 576 frames... [2024-08-05 14:38:48,908][15462] Decorrelating experience for 608 frames... [2024-08-05 14:38:49,275][15469] Decorrelating experience for 608 frames... 
[2024-08-05 14:38:49,305][15447] Decorrelating experience for 416 frames... [2024-08-05 14:38:49,362][15468] Decorrelating experience for 608 frames... [2024-08-05 14:38:49,466][15588] Decorrelating experience for 576 frames... [2024-08-05 14:38:49,467][15417] Signal inference workers to stop experience collection... [2024-08-05 14:38:49,540][15444] InferenceWorker_p0-w0: stopping experience collection [2024-08-05 14:38:49,636][15372] Heartbeat connected on RolloutWorker_w7 [2024-08-05 14:38:49,682][15466] Decorrelating experience for 512 frames... [2024-08-05 14:38:49,804][15372] Heartbeat connected on RolloutWorker_w13 [2024-08-05 14:38:49,883][15458] Decorrelating experience for 544 frames... [2024-08-05 14:38:49,913][15600] Decorrelating experience for 416 frames... [2024-08-05 14:38:50,100][15372] Heartbeat connected on RolloutWorker_w14 [2024-08-05 14:38:50,153][15460] Decorrelating experience for 544 frames... [2024-08-05 14:38:50,174][15447] Decorrelating experience for 448 frames... [2024-08-05 14:38:50,174][15588] Decorrelating experience for 608 frames... [2024-08-05 14:38:50,206][15626] Decorrelating experience for 576 frames... [2024-08-05 14:38:50,510][15461] Decorrelating experience for 608 frames... [2024-08-05 14:38:50,530][15465] Decorrelating experience for 608 frames... [2024-08-05 14:38:50,569][15456] Decorrelating experience for 576 frames... [2024-08-05 14:38:50,656][15372] Heartbeat connected on RolloutWorker_w15 [2024-08-05 14:38:50,703][15466] Decorrelating experience for 544 frames... [2024-08-05 14:38:50,791][15447] Decorrelating experience for 480 frames... [2024-08-05 14:38:50,817][15417] Signal inference workers to resume experience collection... [2024-08-05 14:38:50,817][15444] InferenceWorker_p0-w0: resuming experience collection [2024-08-05 14:38:50,838][15600] Decorrelating experience for 448 frames... [2024-08-05 14:38:50,889][15467] Decorrelating experience for 416 frames... 
[2024-08-05 14:38:51,137][15372] Heartbeat connected on RolloutWorker_w11 [2024-08-05 14:38:51,237][15372] Heartbeat connected on RolloutWorker_w5 [2024-08-05 14:38:51,347][15460] Decorrelating experience for 576 frames... [2024-08-05 14:38:51,657][15458] Decorrelating experience for 576 frames... [2024-08-05 14:38:51,683][15626] Decorrelating experience for 608 frames... [2024-08-05 14:38:52,104][15456] Decorrelating experience for 608 frames... [2024-08-05 14:38:52,265][15467] Decorrelating experience for 448 frames... [2024-08-05 14:38:52,340][15466] Decorrelating experience for 576 frames... [2024-08-05 14:38:52,394][15447] Decorrelating experience for 512 frames... [2024-08-05 14:38:52,598][15600] Decorrelating experience for 480 frames... [2024-08-05 14:38:52,606][15372] Heartbeat connected on RolloutWorker_w19 [2024-08-05 14:38:52,886][15460] Decorrelating experience for 608 frames... [2024-08-05 14:38:52,986][15372] Heartbeat connected on RolloutWorker_w2 [2024-08-05 14:38:53,119][15372] Fps is (10 sec: 4095.8, 60 sec: 1609.6, 300 sec: 1609.6). Total num frames: 40960. Throughput: 0: 258.2. Samples: 6570. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2024-08-05 14:38:53,119][15372] Avg episode reward: [(0, '0.483')] [2024-08-05 14:38:53,337][15467] Decorrelating experience for 480 frames... [2024-08-05 14:38:53,604][15372] Heartbeat connected on RolloutWorker_w6 [2024-08-05 14:38:53,807][15458] Decorrelating experience for 608 frames... [2024-08-05 14:38:53,836][15447] Decorrelating experience for 544 frames... [2024-08-05 14:38:53,995][15466] Decorrelating experience for 608 frames... [2024-08-05 14:38:54,345][15467] Decorrelating experience for 512 frames... [2024-08-05 14:38:54,516][15600] Decorrelating experience for 512 frames... [2024-08-05 14:38:55,038][15372] Heartbeat connected on RolloutWorker_w4 [2024-08-05 14:38:55,185][15447] Decorrelating experience for 576 frames... 
[2024-08-05 14:38:55,279][15372] Heartbeat connected on RolloutWorker_w12 [2024-08-05 14:38:55,365][15444] Updated weights for policy 0, policy_version 10 (0.0020) [2024-08-05 14:38:55,860][15467] Decorrelating experience for 544 frames... [2024-08-05 14:38:56,084][15600] Decorrelating experience for 544 frames... [2024-08-05 14:38:56,118][15447] Decorrelating experience for 608 frames... [2024-08-05 14:38:56,761][15372] Heartbeat connected on RolloutWorker_w0 [2024-08-05 14:38:57,111][15467] Decorrelating experience for 576 frames... [2024-08-05 14:38:57,322][15600] Decorrelating experience for 576 frames... [2024-08-05 14:38:58,030][15467] Decorrelating experience for 608 frames... [2024-08-05 14:38:58,128][15372] Fps is (10 sec: 13094.7, 60 sec: 4303.5, 300 sec: 4303.5). Total num frames: 131072. Throughput: 0: 1077.6. Samples: 32820. Policy #0 lag: (min: 0.0, avg: 3.0, max: 6.0) [2024-08-05 14:38:58,129][15372] Avg episode reward: [(0, '0.607')] [2024-08-05 14:38:58,131][15417] Saving new best policy, reward=0.607! [2024-08-05 14:38:58,829][15372] Heartbeat connected on RolloutWorker_w9 [2024-08-05 14:38:58,880][15600] Decorrelating experience for 608 frames... [2024-08-05 14:38:59,642][15372] Heartbeat connected on RolloutWorker_w17 [2024-08-05 14:38:59,942][15444] Updated weights for policy 0, policy_version 21 (0.0010) [2024-08-05 14:39:03,119][15372] Fps is (10 sec: 18842.5, 60 sec: 6470.9, 300 sec: 6470.9). Total num frames: 229376. Throughput: 0: 1822.4. Samples: 64600. Policy #0 lag: (min: 0.0, avg: 3.5, max: 6.0) [2024-08-05 14:39:03,119][15372] Avg episode reward: [(0, '1.061')] [2024-08-05 14:39:03,250][15417] Saving new best policy, reward=1.061! [2024-08-05 14:39:04,004][15444] Updated weights for policy 0, policy_version 31 (0.0012) [2024-08-05 14:39:08,060][15444] Updated weights for policy 0, policy_version 41 (0.0039) [2024-08-05 14:39:08,119][15372] Fps is (10 sec: 20499.1, 60 sec: 8303.9, 300 sec: 8303.9). Total num frames: 335872. 
Throughput: 0: 1972.4. Samples: 79780. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 14:39:08,120][15372] Avg episode reward: [(0, '1.464')] [2024-08-05 14:39:08,124][15417] Saving new best policy, reward=1.464! [2024-08-05 14:39:12,614][15444] Updated weights for policy 0, policy_version 51 (0.0013) [2024-08-05 14:39:13,119][15372] Fps is (10 sec: 19660.0, 60 sec: 9373.1, 300 sec: 9373.1). Total num frames: 425984. Throughput: 0: 2396.7. Samples: 107850. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 14:39:13,120][15372] Avg episode reward: [(0, '1.484')] [2024-08-05 14:39:13,121][15417] Saving new best policy, reward=1.484! [2024-08-05 14:39:16,662][15444] Updated weights for policy 0, policy_version 61 (0.0032) [2024-08-05 14:39:18,119][15372] Fps is (10 sec: 19660.8, 60 sec: 10555.1, 300 sec: 10555.1). Total num frames: 532480. Throughput: 0: 3088.2. Samples: 138970. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 14:39:18,119][15372] Avg episode reward: [(0, '1.973')] [2024-08-05 14:39:18,122][15417] Saving new best policy, reward=1.973! [2024-08-05 14:39:20,064][15444] Updated weights for policy 0, policy_version 71 (0.0026) [2024-08-05 14:39:23,119][15372] Fps is (10 sec: 21299.4, 60 sec: 11524.0, 300 sec: 11524.0). Total num frames: 638976. Throughput: 0: 3431.3. Samples: 154410. Policy #0 lag: (min: 1.0, avg: 3.1, max: 8.0) [2024-08-05 14:39:23,120][15372] Avg episode reward: [(0, '1.973')] [2024-08-05 14:39:24,458][15444] Updated weights for policy 0, policy_version 81 (0.0040) [2024-08-05 14:39:28,124][15372] Fps is (10 sec: 20469.2, 60 sec: 12287.0, 300 sec: 12196.0). Total num frames: 737280. Throughput: 0: 4086.2. Samples: 183900. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 14:39:28,125][15372] Avg episode reward: [(0, '2.013')] [2024-08-05 14:39:28,128][15417] Saving new best policy, reward=2.013! 
[2024-08-05 14:39:28,832][15444] Updated weights for policy 0, policy_version 91 (0.0028) [2024-08-05 14:39:31,994][15444] Updated weights for policy 0, policy_version 101 (0.0010) [2024-08-05 14:39:33,119][15372] Fps is (10 sec: 20480.1, 60 sec: 14062.8, 300 sec: 12892.4). Total num frames: 843776. Throughput: 0: 4775.5. Samples: 218830. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 14:39:33,119][15372] Avg episode reward: [(0, '2.168')] [2024-08-05 14:39:33,170][15417] Saving new best policy, reward=2.168! [2024-08-05 14:39:35,455][15444] Updated weights for policy 0, policy_version 111 (0.0010) [2024-08-05 14:39:38,119][15372] Fps is (10 sec: 22950.0, 60 sec: 16110.9, 300 sec: 13721.7). Total num frames: 966656. Throughput: 0: 5104.0. Samples: 236250. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 14:39:38,119][15372] Avg episode reward: [(0, '2.348')] [2024-08-05 14:39:38,144][15417] Saving new best policy, reward=2.348! [2024-08-05 14:39:39,181][15444] Updated weights for policy 0, policy_version 121 (0.0012) [2024-08-05 14:39:42,598][15444] Updated weights for policy 0, policy_version 131 (0.0010) [2024-08-05 14:39:43,118][15372] Fps is (10 sec: 23757.7, 60 sec: 18022.4, 300 sec: 14332.5). Total num frames: 1081344. Throughput: 0: 5302.0. Samples: 271360. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 14:39:43,119][15372] Avg episode reward: [(0, '2.644')] [2024-08-05 14:39:43,119][15417] Saving new best policy, reward=2.644! [2024-08-05 14:39:46,053][15444] Updated weights for policy 0, policy_version 141 (0.0027) [2024-08-05 14:39:48,119][15372] Fps is (10 sec: 22937.5, 60 sec: 19933.8, 300 sec: 14867.3). Total num frames: 1196032. Throughput: 0: 5365.3. Samples: 306040. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 14:39:48,119][15372] Avg episode reward: [(0, '2.817')] [2024-08-05 14:39:48,121][15417] Saving new best policy, reward=2.817! 
[2024-08-05 14:39:49,744][15444] Updated weights for policy 0, policy_version 151 (0.0021) [2024-08-05 14:39:53,071][15444] Updated weights for policy 0, policy_version 161 (0.0014) [2024-08-05 14:39:53,119][15372] Fps is (10 sec: 23755.7, 60 sec: 21299.3, 300 sec: 15435.3). Total num frames: 1318912. Throughput: 0: 5411.3. Samples: 323290. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 14:39:53,120][15372] Avg episode reward: [(0, '2.468')] [2024-08-05 14:39:56,599][15444] Updated weights for policy 0, policy_version 171 (0.0022) [2024-08-05 14:39:58,118][15372] Fps is (10 sec: 23757.4, 60 sec: 21712.3, 300 sec: 15850.2). Total num frames: 1433600. Throughput: 0: 5569.0. Samples: 358450. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:39:58,119][15372] Avg episode reward: [(0, '3.076')] [2024-08-05 14:39:58,121][15417] Saving new best policy, reward=3.076! [2024-08-05 14:40:00,047][15444] Updated weights for policy 0, policy_version 181 (0.0029) [2024-08-05 14:40:03,120][15372] Fps is (10 sec: 22935.0, 60 sec: 21981.3, 300 sec: 16221.1). Total num frames: 1548288. Throughput: 0: 5665.2. Samples: 393910. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 14:40:03,121][15372] Avg episode reward: [(0, '3.147')] [2024-08-05 14:40:03,150][15417] Saving new best policy, reward=3.147! [2024-08-05 14:40:04,033][15444] Updated weights for policy 0, policy_version 191 (0.0023) [2024-08-05 14:40:07,783][15444] Updated weights for policy 0, policy_version 201 (0.0024) [2024-08-05 14:40:08,118][15372] Fps is (10 sec: 21299.0, 60 sec: 21845.4, 300 sec: 16392.6). Total num frames: 1646592. Throughput: 0: 5667.2. Samples: 409430. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 14:40:08,119][15372] Avg episode reward: [(0, '2.645')] [2024-08-05 14:40:11,544][15444] Updated weights for policy 0, policy_version 211 (0.0030) [2024-08-05 14:40:13,120][15372] Fps is (10 sec: 21299.2, 60 sec: 22254.5, 300 sec: 16702.7). Total num frames: 1761280. 
Throughput: 0: 5690.9. Samples: 439970. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 14:40:13,120][15372] Avg episode reward: [(0, '3.092')] [2024-08-05 14:40:15,623][15444] Updated weights for policy 0, policy_version 221 (0.0015) [2024-08-05 14:40:16,715][15417] Signal inference workers to stop experience collection... (50 times) [2024-08-05 14:40:16,716][15417] Signal inference workers to resume experience collection... (50 times) [2024-08-05 14:40:16,756][15444] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-08-05 14:40:16,756][15444] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-08-05 14:40:18,119][15372] Fps is (10 sec: 20479.0, 60 sec: 21981.8, 300 sec: 16762.6). Total num frames: 1851392. Throughput: 0: 5601.3. Samples: 470890. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 14:40:18,120][15372] Avg episode reward: [(0, '3.247')] [2024-08-05 14:40:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000000226_1851392.pth... [2024-08-05 14:40:18,259][15417] Saving new best policy, reward=3.247! [2024-08-05 14:40:19,382][15444] Updated weights for policy 0, policy_version 231 (0.0023) [2024-08-05 14:40:23,118][15372] Fps is (10 sec: 20483.3, 60 sec: 22118.6, 300 sec: 17030.1). Total num frames: 1966080. Throughput: 0: 5598.2. Samples: 488170. Policy #0 lag: (min: 1.0, avg: 3.2, max: 8.0) [2024-08-05 14:40:23,119][15372] Avg episode reward: [(0, '3.433')] [2024-08-05 14:40:23,119][15417] Saving new best policy, reward=3.433! [2024-08-05 14:40:23,420][15444] Updated weights for policy 0, policy_version 241 (0.0020) [2024-08-05 14:40:26,735][15444] Updated weights for policy 0, policy_version 251 (0.0020) [2024-08-05 14:40:28,119][15372] Fps is (10 sec: 22938.4, 60 sec: 22393.5, 300 sec: 17275.3). Total num frames: 2080768. Throughput: 0: 5544.6. Samples: 520870. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:40:28,119][15372] Avg episode reward: [(0, '3.599')] [2024-08-05 14:40:28,121][15417] Saving new best policy, reward=3.599! [2024-08-05 14:40:30,502][15444] Updated weights for policy 0, policy_version 261 (0.0015) [2024-08-05 14:40:33,132][15372] Fps is (10 sec: 22905.6, 60 sec: 22522.9, 300 sec: 17499.1). Total num frames: 2195456. Throughput: 0: 5523.6. Samples: 554680. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 14:40:33,133][15372] Avg episode reward: [(0, '3.199')] [2024-08-05 14:40:34,165][15444] Updated weights for policy 0, policy_version 271 (0.0028) [2024-08-05 14:40:37,793][15444] Updated weights for policy 0, policy_version 281 (0.0024) [2024-08-05 14:40:38,119][15372] Fps is (10 sec: 22937.6, 60 sec: 22391.5, 300 sec: 17709.4). Total num frames: 2310144. Throughput: 0: 5523.4. Samples: 571840. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:40:38,119][15372] Avg episode reward: [(0, '3.541')] [2024-08-05 14:40:41,288][15444] Updated weights for policy 0, policy_version 291 (0.0017) [2024-08-05 14:40:43,118][15372] Fps is (10 sec: 22969.9, 60 sec: 22391.5, 300 sec: 17902.4). Total num frames: 2424832. Throughput: 0: 5470.0. Samples: 604600. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:40:43,119][15372] Avg episode reward: [(0, '3.211')] [2024-08-05 14:40:44,973][15444] Updated weights for policy 0, policy_version 301 (0.0019) [2024-08-05 14:40:48,118][15372] Fps is (10 sec: 22118.8, 60 sec: 22255.0, 300 sec: 18023.3). Total num frames: 2531328. Throughput: 0: 5460.2. Samples: 639610. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 14:40:48,119][15372] Avg episode reward: [(0, '3.700')] [2024-08-05 14:40:48,169][15417] Saving new best policy, reward=3.700! 
[2024-08-05 14:40:48,508][15444] Updated weights for policy 0, policy_version 311 (0.0025) [2024-08-05 14:40:51,986][15444] Updated weights for policy 0, policy_version 321 (0.0011) [2024-08-05 14:40:53,119][15372] Fps is (10 sec: 22117.8, 60 sec: 22118.5, 300 sec: 18192.3). Total num frames: 2646016. Throughput: 0: 5503.8. Samples: 657100. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 14:40:53,119][15372] Avg episode reward: [(0, '3.499')] [2024-08-05 14:40:55,712][15444] Updated weights for policy 0, policy_version 331 (0.0015) [2024-08-05 14:40:58,119][15372] Fps is (10 sec: 22937.3, 60 sec: 22118.3, 300 sec: 18350.0). Total num frames: 2760704. Throughput: 0: 5596.0. Samples: 691780. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 14:40:58,119][15372] Avg episode reward: [(0, '3.504')] [2024-08-05 14:40:59,309][15444] Updated weights for policy 0, policy_version 341 (0.0013) [2024-08-05 14:41:02,805][15444] Updated weights for policy 0, policy_version 351 (0.0015) [2024-08-05 14:41:03,118][15372] Fps is (10 sec: 23757.3, 60 sec: 22255.5, 300 sec: 18550.3). Total num frames: 2883584. Throughput: 0: 5681.6. Samples: 726560. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 14:41:03,119][15372] Avg episode reward: [(0, '3.405')] [2024-08-05 14:41:05,260][15417] Signal inference workers to stop experience collection... (100 times) [2024-08-05 14:41:05,261][15417] Signal inference workers to resume experience collection... (100 times) [2024-08-05 14:41:05,293][15444] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-08-05 14:41:05,302][15444] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-08-05 14:41:06,488][15444] Updated weights for policy 0, policy_version 361 (0.0012) [2024-08-05 14:41:08,119][15372] Fps is (10 sec: 23756.8, 60 sec: 22528.0, 300 sec: 18687.0). Total num frames: 2998272. Throughput: 0: 5678.9. Samples: 743720. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 14:41:08,119][15372] Avg episode reward: [(0, '4.062')] [2024-08-05 14:41:08,121][15417] Saving new best policy, reward=4.062! [2024-08-05 14:41:09,666][15444] Updated weights for policy 0, policy_version 371 (0.0020) [2024-08-05 14:41:13,118][15372] Fps is (10 sec: 22118.4, 60 sec: 22392.1, 300 sec: 18765.9). Total num frames: 3104768. Throughput: 0: 5706.2. Samples: 777650. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 14:41:13,119][15372] Avg episode reward: [(0, '4.227')] [2024-08-05 14:41:13,119][15417] Saving new best policy, reward=4.227! [2024-08-05 14:41:13,953][15444] Updated weights for policy 0, policy_version 381 (0.0028) [2024-08-05 14:41:17,577][15444] Updated weights for policy 0, policy_version 391 (0.0020) [2024-08-05 14:41:18,119][15372] Fps is (10 sec: 21299.1, 60 sec: 22664.7, 300 sec: 18840.2). Total num frames: 3211264. Throughput: 0: 5643.5. Samples: 808560. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 14:41:18,119][15372] Avg episode reward: [(0, '4.207')] [2024-08-05 14:41:21,317][15444] Updated weights for policy 0, policy_version 401 (0.0020) [2024-08-05 14:41:23,118][15372] Fps is (10 sec: 22118.4, 60 sec: 22664.5, 300 sec: 18957.0). Total num frames: 3325952. Throughput: 0: 5656.5. Samples: 826380. Policy #0 lag: (min: 1.0, avg: 3.1, max: 7.0) [2024-08-05 14:41:23,119][15372] Avg episode reward: [(0, '4.055')] [2024-08-05 14:41:24,534][15444] Updated weights for policy 0, policy_version 411 (0.0013) [2024-08-05 14:41:28,119][15372] Fps is (10 sec: 22937.7, 60 sec: 22664.6, 300 sec: 19067.3). Total num frames: 3440640. Throughput: 0: 5709.1. Samples: 861510. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 14:41:28,126][15372] Avg episode reward: [(0, '4.325')] [2024-08-05 14:41:28,133][15417] Saving new best policy, reward=4.325! 
[2024-08-05 14:41:28,422][15444] Updated weights for policy 0, policy_version 421 (0.0026) [2024-08-05 14:41:32,116][15444] Updated weights for policy 0, policy_version 431 (0.0019) [2024-08-05 14:41:33,118][15372] Fps is (10 sec: 22937.6, 60 sec: 22669.8, 300 sec: 19171.7). Total num frames: 3555328. Throughput: 0: 5664.4. Samples: 894510. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:41:33,119][15372] Avg episode reward: [(0, '4.578')] [2024-08-05 14:41:33,154][15417] Saving new best policy, reward=4.578! [2024-08-05 14:41:35,438][15444] Updated weights for policy 0, policy_version 441 (0.0027) [2024-08-05 14:41:38,118][15372] Fps is (10 sec: 22937.9, 60 sec: 22664.6, 300 sec: 19270.5). Total num frames: 3670016. Throughput: 0: 5651.6. Samples: 911420. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 14:41:38,119][15372] Avg episode reward: [(0, '4.568')] [2024-08-05 14:41:39,407][15444] Updated weights for policy 0, policy_version 451 (0.0029) [2024-08-05 14:41:42,884][15444] Updated weights for policy 0, policy_version 461 (0.0016) [2024-08-05 14:41:43,119][15372] Fps is (10 sec: 22117.3, 60 sec: 22527.8, 300 sec: 19322.4). Total num frames: 3776512. Throughput: 0: 5640.2. Samples: 945590. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 14:41:43,119][15372] Avg episode reward: [(0, '4.756')] [2024-08-05 14:41:43,120][15417] Saving new best policy, reward=4.756! [2024-08-05 14:41:47,049][15444] Updated weights for policy 0, policy_version 471 (0.0016) [2024-08-05 14:41:48,118][15372] Fps is (10 sec: 20479.9, 60 sec: 22391.5, 300 sec: 19330.9). Total num frames: 3874816. Throughput: 0: 5533.1. Samples: 975550. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 14:41:48,119][15372] Avg episode reward: [(0, '4.799')] [2024-08-05 14:41:48,123][15417] Saving new best policy, reward=4.799! [2024-08-05 14:41:49,531][15417] Signal inference workers to stop experience collection... 
(150 times) [2024-08-05 14:41:49,531][15417] Signal inference workers to resume experience collection... (150 times) [2024-08-05 14:41:49,567][15444] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-08-05 14:41:49,629][15444] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-08-05 14:41:50,906][15444] Updated weights for policy 0, policy_version 481 (0.0013) [2024-08-05 14:41:53,118][15372] Fps is (10 sec: 21300.3, 60 sec: 22391.5, 300 sec: 19418.6). Total num frames: 3989504. Throughput: 0: 5540.2. Samples: 993030. Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 14:41:53,119][15372] Avg episode reward: [(0, '5.013')] [2024-08-05 14:41:53,120][15417] Saving new best policy, reward=5.013! [2024-08-05 14:41:54,269][15444] Updated weights for policy 0, policy_version 491 (0.0016) [2024-08-05 14:41:58,118][15372] Fps is (10 sec: 22118.5, 60 sec: 22255.0, 300 sec: 19463.3). Total num frames: 4096000. Throughput: 0: 5526.4. Samples: 1026340. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 14:41:58,119][15372] Avg episode reward: [(0, '4.667')] [2024-08-05 14:41:58,335][15444] Updated weights for policy 0, policy_version 501 (0.0018) [2024-08-05 14:42:01,570][15444] Updated weights for policy 0, policy_version 511 (0.0013) [2024-08-05 14:42:03,121][15372] Fps is (10 sec: 22113.7, 60 sec: 22117.6, 300 sec: 19543.8). Total num frames: 4210688. Throughput: 0: 5575.1. Samples: 1059450. Policy #0 lag: (min: 2.0, avg: 4.9, max: 9.0) [2024-08-05 14:42:03,121][15372] Avg episode reward: [(0, '4.781')] [2024-08-05 14:42:05,526][15444] Updated weights for policy 0, policy_version 521 (0.0013) [2024-08-05 14:42:08,118][15372] Fps is (10 sec: 22937.5, 60 sec: 22118.4, 300 sec: 19620.9). Total num frames: 4325376. Throughput: 0: 5547.3. Samples: 1076010. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 9.0) [2024-08-05 14:42:08,119][15372] Avg episode reward: [(0, '5.122')] [2024-08-05 14:42:08,122][15417] Saving new best policy, reward=5.122! [2024-08-05 14:42:09,539][15444] Updated weights for policy 0, policy_version 531 (0.0020) [2024-08-05 14:42:12,854][15444] Updated weights for policy 0, policy_version 541 (0.0021) [2024-08-05 14:42:13,118][15372] Fps is (10 sec: 22942.6, 60 sec: 22254.9, 300 sec: 19694.5). Total num frames: 4440064. Throughput: 0: 5509.4. Samples: 1109430. Policy #0 lag: (min: 0.0, avg: 3.0, max: 9.0) [2024-08-05 14:42:13,119][15372] Avg episode reward: [(0, '5.519')] [2024-08-05 14:42:13,125][15417] Saving new best policy, reward=5.519! [2024-08-05 14:42:16,984][15444] Updated weights for policy 0, policy_version 551 (0.0027) [2024-08-05 14:42:18,118][15372] Fps is (10 sec: 22118.4, 60 sec: 22255.0, 300 sec: 19729.3). Total num frames: 4546560. Throughput: 0: 5496.7. Samples: 1141860. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 14:42:18,119][15372] Avg episode reward: [(0, '5.501')] [2024-08-05 14:42:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000000555_4546560.pth... [2024-08-05 14:42:20,099][15444] Updated weights for policy 0, policy_version 561 (0.0015) [2024-08-05 14:42:23,118][15372] Fps is (10 sec: 21299.2, 60 sec: 22118.4, 300 sec: 19762.6). Total num frames: 4653056. Throughput: 0: 5497.8. Samples: 1158820. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 14:42:23,119][15372] Avg episode reward: [(0, '5.700')] [2024-08-05 14:42:23,120][15417] Saving new best policy, reward=5.700! [2024-08-05 14:42:24,200][15444] Updated weights for policy 0, policy_version 571 (0.0046) [2024-08-05 14:42:27,429][15444] Updated weights for policy 0, policy_version 581 (0.0026) [2024-08-05 14:42:28,119][15372] Fps is (10 sec: 21298.5, 60 sec: 21981.8, 300 sec: 19794.6). Total num frames: 4759552. 
Throughput: 0: 5489.1. Samples: 1192600. Policy #0 lag: (min: 1.0, avg: 4.7, max: 9.0) [2024-08-05 14:42:28,119][15372] Avg episode reward: [(0, '6.577')] [2024-08-05 14:42:28,122][15417] Saving new best policy, reward=6.577! [2024-08-05 14:42:29,295][15417] Signal inference workers to stop experience collection... (200 times) [2024-08-05 14:42:29,295][15417] Signal inference workers to resume experience collection... (200 times) [2024-08-05 14:42:29,364][15444] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-08-05 14:42:29,368][15444] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-08-05 14:42:31,358][15444] Updated weights for policy 0, policy_version 591 (0.0029) [2024-08-05 14:42:33,122][15372] Fps is (10 sec: 22929.1, 60 sec: 22117.1, 300 sec: 19891.7). Total num frames: 4882432. Throughput: 0: 5570.4. Samples: 1226240. Policy #0 lag: (min: 0.0, avg: 2.7, max: 8.0) [2024-08-05 14:42:33,122][15372] Avg episode reward: [(0, '6.622')] [2024-08-05 14:42:33,123][15417] Saving new best policy, reward=6.622! [2024-08-05 14:42:35,079][15444] Updated weights for policy 0, policy_version 601 (0.0016) [2024-08-05 14:42:38,118][15372] Fps is (10 sec: 23757.7, 60 sec: 22118.4, 300 sec: 19952.8). Total num frames: 4997120. Throughput: 0: 5565.1. Samples: 1243460. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:42:38,119][15372] Avg episode reward: [(0, '6.531')] [2024-08-05 14:42:38,516][15444] Updated weights for policy 0, policy_version 611 (0.0029) [2024-08-05 14:42:42,340][15444] Updated weights for policy 0, policy_version 621 (0.0028) [2024-08-05 14:42:43,118][15372] Fps is (10 sec: 22126.5, 60 sec: 22118.6, 300 sec: 19979.1). Total num frames: 5103616. Throughput: 0: 5558.4. Samples: 1276470. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 14:42:43,119][15372] Avg episode reward: [(0, '6.991')] [2024-08-05 14:42:43,119][15417] Saving new best policy, reward=6.991! 
[2024-08-05 14:42:46,367][15444] Updated weights for policy 0, policy_version 631 (0.0040) [2024-08-05 14:42:48,118][15372] Fps is (10 sec: 20479.9, 60 sec: 22118.4, 300 sec: 19973.0). Total num frames: 5201920. Throughput: 0: 5491.1. Samples: 1306540. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:42:48,119][15372] Avg episode reward: [(0, '6.949')] [2024-08-05 14:42:50,220][15444] Updated weights for policy 0, policy_version 641 (0.0014) [2024-08-05 14:42:53,118][15372] Fps is (10 sec: 20479.9, 60 sec: 21981.9, 300 sec: 19998.0). Total num frames: 5308416. Throughput: 0: 5486.7. Samples: 1322910. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 14:42:53,126][15372] Avg episode reward: [(0, '7.850')] [2024-08-05 14:42:53,127][15417] Saving new best policy, reward=7.850! [2024-08-05 14:42:54,056][15444] Updated weights for policy 0, policy_version 651 (0.0019) [2024-08-05 14:42:58,041][15444] Updated weights for policy 0, policy_version 661 (0.0014) [2024-08-05 14:42:58,118][15372] Fps is (10 sec: 21299.2, 60 sec: 21981.9, 300 sec: 20022.1). Total num frames: 5414912. Throughput: 0: 5486.7. Samples: 1356330. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:42:58,119][15372] Avg episode reward: [(0, '8.532')] [2024-08-05 14:42:58,123][15417] Saving new best policy, reward=8.532! [2024-08-05 14:43:02,371][15444] Updated weights for policy 0, policy_version 671 (0.0026) [2024-08-05 14:43:03,118][15372] Fps is (10 sec: 19660.8, 60 sec: 21573.0, 300 sec: 19985.8). Total num frames: 5505024. Throughput: 0: 5347.1. Samples: 1382480. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 14:43:03,119][15372] Avg episode reward: [(0, '8.240')] [2024-08-05 14:43:06,260][15444] Updated weights for policy 0, policy_version 681 (0.0034) [2024-08-05 14:43:08,118][15372] Fps is (10 sec: 20480.1, 60 sec: 21572.3, 300 sec: 20038.4). Total num frames: 5619712. Throughput: 0: 5368.7. Samples: 1400410. 
Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 14:43:08,119][15372] Avg episode reward: [(0, '8.533')] [2024-08-05 14:43:09,352][15444] Updated weights for policy 0, policy_version 691 (0.0012) [2024-08-05 14:43:12,800][15444] Updated weights for policy 0, policy_version 701 (0.0029) [2024-08-05 14:43:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 21845.3, 300 sec: 20146.6). Total num frames: 5750784. Throughput: 0: 5429.8. Samples: 1436940. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:43:13,119][15372] Avg episode reward: [(0, '8.946')] [2024-08-05 14:43:13,119][15417] Saving new best policy, reward=8.946! [2024-08-05 14:43:16,695][15444] Updated weights for policy 0, policy_version 711 (0.0026) [2024-08-05 14:43:18,119][15372] Fps is (10 sec: 23756.0, 60 sec: 21845.2, 300 sec: 20166.4). Total num frames: 5857280. Throughput: 0: 5430.0. Samples: 1470570. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 14:43:18,119][15372] Avg episode reward: [(0, '8.869')] [2024-08-05 14:43:19,886][15444] Updated weights for policy 0, policy_version 721 (0.0021) [2024-08-05 14:43:21,976][15417] Signal inference workers to stop experience collection... (250 times) [2024-08-05 14:43:21,978][15417] Signal inference workers to resume experience collection... (250 times) [2024-08-05 14:43:21,999][15444] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-08-05 14:43:22,004][15444] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-08-05 14:43:23,119][15372] Fps is (10 sec: 22118.0, 60 sec: 21981.8, 300 sec: 20244.0). Total num frames: 5971968. Throughput: 0: 5459.5. Samples: 1489140. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:43:23,119][15372] Avg episode reward: [(0, '9.489')] [2024-08-05 14:43:23,148][15417] Saving new best policy, reward=9.489! 
[2024-08-05 14:43:23,454][15444] Updated weights for policy 0, policy_version 731 (0.0013) [2024-08-05 14:43:26,980][15444] Updated weights for policy 0, policy_version 741 (0.0021) [2024-08-05 14:43:28,118][15372] Fps is (10 sec: 24576.8, 60 sec: 22391.6, 300 sec: 20688.3). Total num frames: 6103040. Throughput: 0: 5534.7. Samples: 1525530. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 14:43:28,119][15372] Avg episode reward: [(0, '10.514')] [2024-08-05 14:43:28,122][15417] Saving new best policy, reward=10.514! [2024-08-05 14:43:30,023][15444] Updated weights for policy 0, policy_version 751 (0.0013) [2024-08-05 14:43:33,119][15372] Fps is (10 sec: 24574.0, 60 sec: 22255.9, 300 sec: 21077.0). Total num frames: 6217728. Throughput: 0: 5655.2. Samples: 1561030. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:43:33,127][15372] Avg episode reward: [(0, '10.529')] [2024-08-05 14:43:33,214][15417] Saving new best policy, reward=10.529! [2024-08-05 14:43:33,583][15444] Updated weights for policy 0, policy_version 761 (0.0013) [2024-08-05 14:43:37,394][15444] Updated weights for policy 0, policy_version 771 (0.0015) [2024-08-05 14:43:38,118][15372] Fps is (10 sec: 22937.4, 60 sec: 22254.9, 300 sec: 21465.8). Total num frames: 6332416. Throughput: 0: 5647.1. Samples: 1577030. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 14:43:38,119][15372] Avg episode reward: [(0, '11.346')] [2024-08-05 14:43:38,121][15417] Saving new best policy, reward=11.346! [2024-08-05 14:43:41,057][15444] Updated weights for policy 0, policy_version 781 (0.0021) [2024-08-05 14:43:43,118][15372] Fps is (10 sec: 22939.9, 60 sec: 22391.5, 300 sec: 21854.6). Total num frames: 6447104. Throughput: 0: 5692.4. Samples: 1612490. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 14:43:43,119][15372] Avg episode reward: [(0, '11.327')] [2024-08-05 14:43:44,053][15444] Updated weights for policy 0, policy_version 791 (0.0018) [2024-08-05 14:43:47,782][15444] Updated weights for policy 0, policy_version 801 (0.0020) [2024-08-05 14:43:48,118][15372] Fps is (10 sec: 23757.0, 60 sec: 22801.1, 300 sec: 22132.3). Total num frames: 6569984. Throughput: 0: 5923.8. Samples: 1649050. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 14:43:48,119][15372] Avg episode reward: [(0, '12.734')] [2024-08-05 14:43:48,121][15417] Saving new best policy, reward=12.734! [2024-08-05 14:43:50,959][15444] Updated weights for policy 0, policy_version 811 (0.0012) [2024-08-05 14:43:53,119][15372] Fps is (10 sec: 23756.5, 60 sec: 22937.6, 300 sec: 22216.3). Total num frames: 6684672. Throughput: 0: 5914.2. Samples: 1666550. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 14:43:53,119][15372] Avg episode reward: [(0, '13.152')] [2024-08-05 14:43:53,120][15417] Saving new best policy, reward=13.152! [2024-08-05 14:43:54,735][15444] Updated weights for policy 0, policy_version 821 (0.0014) [2024-08-05 14:43:58,119][15372] Fps is (10 sec: 22937.2, 60 sec: 23074.1, 300 sec: 22271.1). Total num frames: 6799360. Throughput: 0: 5857.3. Samples: 1700520. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 14:43:58,119][15372] Avg episode reward: [(0, '13.464')] [2024-08-05 14:43:58,122][15417] Saving new best policy, reward=13.464! [2024-08-05 14:43:58,461][15444] Updated weights for policy 0, policy_version 831 (0.0014) [2024-08-05 14:44:01,723][15444] Updated weights for policy 0, policy_version 841 (0.0034) [2024-08-05 14:44:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23620.3, 300 sec: 22326.7). Total num frames: 6922240. Throughput: 0: 5880.7. Samples: 1735200. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 14:44:03,119][15372] Avg episode reward: [(0, '14.598')] [2024-08-05 14:44:03,123][15417] Saving new best policy, reward=14.598! [2024-08-05 14:44:05,262][15444] Updated weights for policy 0, policy_version 851 (0.0011) [2024-08-05 14:44:07,888][15417] Signal inference workers to stop experience collection... (300 times) [2024-08-05 14:44:07,889][15417] Signal inference workers to resume experience collection... (300 times) [2024-08-05 14:44:07,960][15444] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-08-05 14:44:07,961][15444] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-08-05 14:44:08,119][15372] Fps is (10 sec: 23756.2, 60 sec: 23620.1, 300 sec: 22410.0). Total num frames: 7036928. Throughput: 0: 5872.0. Samples: 1753380. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 14:44:08,119][15372] Avg episode reward: [(0, '14.637')] [2024-08-05 14:44:08,122][15417] Saving new best policy, reward=14.637! [2024-08-05 14:44:08,697][15444] Updated weights for policy 0, policy_version 861 (0.0017) [2024-08-05 14:44:12,285][15444] Updated weights for policy 0, policy_version 871 (0.0014) [2024-08-05 14:44:13,119][15372] Fps is (10 sec: 22936.3, 60 sec: 23347.0, 300 sec: 22437.7). Total num frames: 7151616. Throughput: 0: 5855.7. Samples: 1789040. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:44:13,120][15372] Avg episode reward: [(0, '14.524')] [2024-08-05 14:44:15,446][15444] Updated weights for policy 0, policy_version 881 (0.0018) [2024-08-05 14:44:18,118][15372] Fps is (10 sec: 23757.8, 60 sec: 23620.4, 300 sec: 22493.3). Total num frames: 7274496. Throughput: 0: 5859.7. Samples: 1824710. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 14:44:18,119][15372] Avg episode reward: [(0, '15.049')] [2024-08-05 14:44:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000000888_7274496.pth... [2024-08-05 14:44:18,231][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000000226_1851392.pth [2024-08-05 14:44:18,242][15417] Saving new best policy, reward=15.049! [2024-08-05 14:44:19,121][15444] Updated weights for policy 0, policy_version 891 (0.0011) [2024-08-05 14:44:22,716][15444] Updated weights for policy 0, policy_version 901 (0.0012) [2024-08-05 14:44:23,118][15372] Fps is (10 sec: 23758.2, 60 sec: 23620.3, 300 sec: 22549.3). Total num frames: 7389184. Throughput: 0: 5895.3. Samples: 1842320. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:44:23,119][15372] Avg episode reward: [(0, '15.907')] [2024-08-05 14:44:23,119][15417] Saving new best policy, reward=15.907! [2024-08-05 14:44:26,056][15444] Updated weights for policy 0, policy_version 911 (0.0014) [2024-08-05 14:44:28,119][15372] Fps is (10 sec: 22936.4, 60 sec: 23347.0, 300 sec: 22576.6). Total num frames: 7503872. Throughput: 0: 5873.0. Samples: 1876780. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:44:28,120][15372] Avg episode reward: [(0, '16.605')] [2024-08-05 14:44:28,192][15417] Saving new best policy, reward=16.605! [2024-08-05 14:44:29,841][15444] Updated weights for policy 0, policy_version 921 (0.0021) [2024-08-05 14:44:32,959][15444] Updated weights for policy 0, policy_version 931 (0.0021) [2024-08-05 14:44:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23484.1, 300 sec: 22576.6). Total num frames: 7626752. Throughput: 0: 5843.3. Samples: 1912000. 
Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 14:44:33,119][15372] Avg episode reward: [(0, '16.917')] [2024-08-05 14:44:33,119][15417] Saving new best policy, reward=16.917! [2024-08-05 14:44:36,751][15444] Updated weights for policy 0, policy_version 941 (0.0014) [2024-08-05 14:44:38,124][15372] Fps is (10 sec: 23745.1, 60 sec: 23481.6, 300 sec: 22576.2). Total num frames: 7741440. Throughput: 0: 5852.0. Samples: 1929920. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 14:44:38,124][15372] Avg episode reward: [(0, '16.174')] [2024-08-05 14:44:40,088][15444] Updated weights for policy 0, policy_version 951 (0.0014) [2024-08-05 14:44:43,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23483.7, 300 sec: 22576.6). Total num frames: 7856128. Throughput: 0: 5890.5. Samples: 1965590. Policy #0 lag: (min: 1.0, avg: 4.8, max: 8.0) [2024-08-05 14:44:43,119][15372] Avg episode reward: [(0, '16.652')] [2024-08-05 14:44:43,441][15444] Updated weights for policy 0, policy_version 961 (0.0014) [2024-08-05 14:44:45,399][15417] Signal inference workers to stop experience collection... (350 times) [2024-08-05 14:44:45,402][15417] Signal inference workers to resume experience collection... (350 times) [2024-08-05 14:44:45,441][15444] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-08-05 14:44:45,446][15444] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-08-05 14:44:46,859][15444] Updated weights for policy 0, policy_version 971 (0.0021) [2024-08-05 14:44:48,119][15372] Fps is (10 sec: 24588.2, 60 sec: 23620.1, 300 sec: 22604.4). Total num frames: 7987200. Throughput: 0: 5927.7. Samples: 2001950. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 14:44:48,119][15372] Avg episode reward: [(0, '17.621')] [2024-08-05 14:44:48,122][15417] Saving new best policy, reward=17.621! 
[2024-08-05 14:44:50,122][15444] Updated weights for policy 0, policy_version 981 (0.0029) [2024-08-05 14:44:53,118][15372] Fps is (10 sec: 24576.2, 60 sec: 23620.3, 300 sec: 22604.4). Total num frames: 8101888. Throughput: 0: 5922.1. Samples: 2019870. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 14:44:53,119][15372] Avg episode reward: [(0, '16.182')] [2024-08-05 14:44:53,872][15444] Updated weights for policy 0, policy_version 991 (0.0027) [2024-08-05 14:44:56,878][15444] Updated weights for policy 0, policy_version 1001 (0.0013) [2024-08-05 14:44:58,119][15372] Fps is (10 sec: 22938.1, 60 sec: 23620.2, 300 sec: 22604.5). Total num frames: 8216576. Throughput: 0: 5904.9. Samples: 2054760. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 14:44:58,119][15372] Avg episode reward: [(0, '17.529')] [2024-08-05 14:45:00,595][15444] Updated weights for policy 0, policy_version 1011 (0.0019) [2024-08-05 14:45:03,119][15372] Fps is (10 sec: 24575.9, 60 sec: 23756.8, 300 sec: 22715.4). Total num frames: 8347648. Throughput: 0: 5929.8. Samples: 2091550. Policy #0 lag: (min: 2.0, avg: 4.8, max: 9.0) [2024-08-05 14:45:03,119][15372] Avg episode reward: [(0, '16.849')] [2024-08-05 14:45:04,192][15444] Updated weights for policy 0, policy_version 1021 (0.0025) [2024-08-05 14:45:07,366][15444] Updated weights for policy 0, policy_version 1031 (0.0030) [2024-08-05 14:45:08,118][15372] Fps is (10 sec: 23757.4, 60 sec: 23620.4, 300 sec: 22687.8). Total num frames: 8454144. Throughput: 0: 5921.1. Samples: 2108770. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 14:45:08,119][15372] Avg episode reward: [(0, '16.591')] [2024-08-05 14:45:10,975][15444] Updated weights for policy 0, policy_version 1041 (0.0025) [2024-08-05 14:45:13,119][15372] Fps is (10 sec: 22937.7, 60 sec: 23757.0, 300 sec: 22798.8). Total num frames: 8577024. Throughput: 0: 5941.4. Samples: 2144140. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 14:45:13,119][15372] Avg episode reward: [(0, '17.971')] [2024-08-05 14:45:13,119][15417] Saving new best policy, reward=17.971! [2024-08-05 14:45:14,554][15444] Updated weights for policy 0, policy_version 1051 (0.0039) [2024-08-05 14:45:17,839][15444] Updated weights for policy 0, policy_version 1061 (0.0021) [2024-08-05 14:45:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 23756.8, 300 sec: 22826.5). Total num frames: 8699904. Throughput: 0: 5948.2. Samples: 2179670. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 14:45:18,119][15372] Avg episode reward: [(0, '18.707')] [2024-08-05 14:45:18,121][15417] Saving new best policy, reward=18.707! [2024-08-05 14:45:21,461][15444] Updated weights for policy 0, policy_version 1071 (0.0017) [2024-08-05 14:45:23,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23620.3, 300 sec: 22798.8). Total num frames: 8806400. Throughput: 0: 5941.4. Samples: 2197250. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 14:45:23,119][15372] Avg episode reward: [(0, '19.145')] [2024-08-05 14:45:23,120][15417] Saving new best policy, reward=19.145! [2024-08-05 14:45:24,679][15444] Updated weights for policy 0, policy_version 1081 (0.0019) [2024-08-05 14:45:26,382][15417] Signal inference workers to stop experience collection... (400 times) [2024-08-05 14:45:26,382][15417] Signal inference workers to resume experience collection... (400 times) [2024-08-05 14:45:26,427][15444] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-08-05 14:45:26,428][15444] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-08-05 14:45:28,118][15372] Fps is (10 sec: 22937.4, 60 sec: 23757.0, 300 sec: 22827.6). Total num frames: 8929280. Throughput: 0: 5934.4. Samples: 2232640. 
Policy #0 lag: (min: 1.0, avg: 4.6, max: 8.0) [2024-08-05 14:45:28,119][15372] Avg episode reward: [(0, '19.572')] [2024-08-05 14:45:28,122][15417] Saving new best policy, reward=19.572! [2024-08-05 14:45:28,359][15444] Updated weights for policy 0, policy_version 1091 (0.0023) [2024-08-05 14:45:31,859][15444] Updated weights for policy 0, policy_version 1101 (0.0017) [2024-08-05 14:45:33,119][15372] Fps is (10 sec: 24574.4, 60 sec: 23756.5, 300 sec: 22854.3). Total num frames: 9052160. Throughput: 0: 5925.5. Samples: 2268600. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 14:45:33,119][15372] Avg episode reward: [(0, '19.662')] [2024-08-05 14:45:33,156][15417] Saving new best policy, reward=19.662! [2024-08-05 14:45:35,073][15444] Updated weights for policy 0, policy_version 1111 (0.0020) [2024-08-05 14:45:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23759.0, 300 sec: 22854.3). Total num frames: 9166848. Throughput: 0: 5925.1. Samples: 2286500. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 14:45:38,126][15372] Avg episode reward: [(0, '19.628')] [2024-08-05 14:45:38,886][15444] Updated weights for policy 0, policy_version 1121 (0.0015) [2024-08-05 14:45:41,944][15444] Updated weights for policy 0, policy_version 1131 (0.0022) [2024-08-05 14:45:43,119][15372] Fps is (10 sec: 23757.0, 60 sec: 23893.1, 300 sec: 22909.8). Total num frames: 9289728. Throughput: 0: 5937.1. Samples: 2321930. Policy #0 lag: (min: 1.0, avg: 4.2, max: 9.0) [2024-08-05 14:45:43,119][15372] Avg episode reward: [(0, '19.396')] [2024-08-05 14:45:45,649][15444] Updated weights for policy 0, policy_version 1141 (0.0019) [2024-08-05 14:45:48,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23620.4, 300 sec: 22909.8). Total num frames: 9404416. Throughput: 0: 5926.0. Samples: 2358220. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:45:48,119][15372] Avg episode reward: [(0, '20.439')] [2024-08-05 14:45:48,124][15417] Saving new best policy, reward=20.439! 
[2024-08-05 14:45:49,003][15444] Updated weights for policy 0, policy_version 1151 (0.0012) [2024-08-05 14:45:52,428][15444] Updated weights for policy 0, policy_version 1161 (0.0021) [2024-08-05 14:45:53,118][15372] Fps is (10 sec: 23758.1, 60 sec: 23756.8, 300 sec: 22937.6). Total num frames: 9527296. Throughput: 0: 5927.3. Samples: 2375500. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 14:45:53,126][15372] Avg episode reward: [(0, '20.080')] [2024-08-05 14:45:55,691][15444] Updated weights for policy 0, policy_version 1171 (0.0031) [2024-08-05 14:45:58,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23756.9, 300 sec: 22909.8). Total num frames: 9641984. Throughput: 0: 5926.9. Samples: 2410850. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:45:58,119][15372] Avg episode reward: [(0, '19.735')] [2024-08-05 14:45:59,447][15444] Updated weights for policy 0, policy_version 1181 (0.0011) [2024-08-05 14:46:02,744][15444] Updated weights for policy 0, policy_version 1191 (0.0023) [2024-08-05 14:46:03,119][15372] Fps is (10 sec: 23756.2, 60 sec: 23620.2, 300 sec: 22937.6). Total num frames: 9764864. Throughput: 0: 5917.7. Samples: 2445970. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:46:03,119][15372] Avg episode reward: [(0, '19.428')] [2024-08-05 14:46:06,395][15444] Updated weights for policy 0, policy_version 1201 (0.0016) [2024-08-05 14:46:08,119][15372] Fps is (10 sec: 23754.9, 60 sec: 23756.5, 300 sec: 22965.3). Total num frames: 9879552. Throughput: 0: 5941.7. Samples: 2464630. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 14:46:08,120][15372] Avg episode reward: [(0, '20.694')] [2024-08-05 14:46:08,122][15417] Saving new best policy, reward=20.694! [2024-08-05 14:46:09,563][15444] Updated weights for policy 0, policy_version 1211 (0.0030) [2024-08-05 14:46:13,119][15372] Fps is (10 sec: 22937.9, 60 sec: 23620.2, 300 sec: 22993.1). Total num frames: 9994240. Throughput: 0: 5952.0. Samples: 2500480. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 14:46:13,127][15372] Avg episode reward: [(0, '20.196')] [2024-08-05 14:46:13,187][15444] Updated weights for policy 0, policy_version 1221 (0.0021) [2024-08-05 14:46:16,609][15444] Updated weights for policy 0, policy_version 1231 (0.0012) [2024-08-05 14:46:17,200][15417] Signal inference workers to stop experience collection... (450 times) [2024-08-05 14:46:17,208][15417] Signal inference workers to resume experience collection... (450 times) [2024-08-05 14:46:17,272][15444] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-08-05 14:46:17,272][15444] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-08-05 14:46:18,119][15372] Fps is (10 sec: 23756.3, 60 sec: 23619.9, 300 sec: 23020.8). Total num frames: 10117120. Throughput: 0: 5935.7. Samples: 2535710. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 14:46:18,120][15372] Avg episode reward: [(0, '20.294')] [2024-08-05 14:46:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000001235_10117120.pth... [2024-08-05 14:46:18,250][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000000555_4546560.pth [2024-08-05 14:46:19,943][15444] Updated weights for policy 0, policy_version 1241 (0.0012) [2024-08-05 14:46:23,119][15372] Fps is (10 sec: 24575.8, 60 sec: 23893.2, 300 sec: 23048.7). Total num frames: 10240000. Throughput: 0: 5937.5. Samples: 2553690. Policy #0 lag: (min: 1.0, avg: 4.5, max: 7.0) [2024-08-05 14:46:23,119][15372] Avg episode reward: [(0, '20.383')] [2024-08-05 14:46:23,726][15444] Updated weights for policy 0, policy_version 1251 (0.0017) [2024-08-05 14:46:26,839][15444] Updated weights for policy 0, policy_version 1261 (0.0030) [2024-08-05 14:46:28,118][15372] Fps is (10 sec: 23759.3, 60 sec: 23756.8, 300 sec: 23048.7). 
Total num frames: 10354688. Throughput: 0: 5930.5. Samples: 2588800. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 14:46:28,119][15372] Avg episode reward: [(0, '20.281')] [2024-08-05 14:46:30,427][15444] Updated weights for policy 0, policy_version 1271 (0.0021) [2024-08-05 14:46:33,118][15372] Fps is (10 sec: 23757.3, 60 sec: 23757.1, 300 sec: 23076.4). Total num frames: 10477568. Throughput: 0: 5938.7. Samples: 2625460. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 14:46:33,119][15372] Avg episode reward: [(0, '20.753')] [2024-08-05 14:46:33,120][15417] Saving new best policy, reward=20.753! [2024-08-05 14:46:33,655][15444] Updated weights for policy 0, policy_version 1281 (0.0013) [2024-08-05 14:46:37,354][15444] Updated weights for policy 0, policy_version 1291 (0.0024) [2024-08-05 14:46:38,119][15372] Fps is (10 sec: 23756.2, 60 sec: 23756.7, 300 sec: 23104.2). Total num frames: 10592256. Throughput: 0: 5946.4. Samples: 2643090. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 14:46:38,119][15372] Avg episode reward: [(0, '19.763')] [2024-08-05 14:46:40,690][15444] Updated weights for policy 0, policy_version 1301 (0.0026) [2024-08-05 14:46:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23757.0, 300 sec: 23187.5). Total num frames: 10715136. Throughput: 0: 5934.2. Samples: 2677890. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 14:46:43,119][15372] Avg episode reward: [(0, '20.304')] [2024-08-05 14:46:44,218][15444] Updated weights for policy 0, policy_version 1311 (0.0030) [2024-08-05 14:46:47,676][15444] Updated weights for policy 0, policy_version 1321 (0.0023) [2024-08-05 14:46:48,119][15372] Fps is (10 sec: 22938.0, 60 sec: 23620.3, 300 sec: 23159.8). Total num frames: 10821632. Throughput: 0: 5926.0. Samples: 2712640. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:46:48,119][15372] Avg episode reward: [(0, '20.595')] [2024-08-05 14:46:51,225][15444] Updated weights for policy 0, policy_version 1331 (0.0019) [2024-08-05 14:46:53,119][15372] Fps is (10 sec: 22937.2, 60 sec: 23620.2, 300 sec: 23215.3). Total num frames: 10944512. Throughput: 0: 5920.3. Samples: 2731040. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 14:46:53,119][15372] Avg episode reward: [(0, '21.049')] [2024-08-05 14:46:53,120][15417] Saving new best policy, reward=21.049! [2024-08-05 14:46:54,835][15444] Updated weights for policy 0, policy_version 1341 (0.0029) [2024-08-05 14:46:57,966][15444] Updated weights for policy 0, policy_version 1351 (0.0018) [2024-08-05 14:46:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 23756.8, 300 sec: 23243.2). Total num frames: 11067392. Throughput: 0: 5913.3. Samples: 2766580. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 14:46:58,119][15372] Avg episode reward: [(0, '21.981')] [2024-08-05 14:46:58,121][15417] Saving new best policy, reward=21.981! [2024-08-05 14:47:01,805][15444] Updated weights for policy 0, policy_version 1361 (0.0018) [2024-08-05 14:47:03,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23620.3, 300 sec: 23243.0). Total num frames: 11182080. Throughput: 0: 5896.1. Samples: 2801030. Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 14:47:03,119][15372] Avg episode reward: [(0, '21.836')] [2024-08-05 14:47:04,947][15444] Updated weights for policy 0, policy_version 1371 (0.0021) [2024-08-05 14:47:08,119][15372] Fps is (10 sec: 22937.3, 60 sec: 23620.5, 300 sec: 23243.1). Total num frames: 11296768. Throughput: 0: 5907.1. Samples: 2819510. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 14:47:08,119][15372] Avg episode reward: [(0, '21.967')] [2024-08-05 14:47:08,683][15444] Updated weights for policy 0, policy_version 1381 (0.0019) [2024-08-05 14:47:11,038][15417] Signal inference workers to stop experience collection... (500 times) [2024-08-05 14:47:11,039][15417] Signal inference workers to resume experience collection... (500 times) [2024-08-05 14:47:11,125][15444] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-08-05 14:47:11,132][15444] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-08-05 14:47:12,052][15444] Updated weights for policy 0, policy_version 1391 (0.0014) [2024-08-05 14:47:13,118][15372] Fps is (10 sec: 23757.2, 60 sec: 23756.8, 300 sec: 23298.6). Total num frames: 11419648. Throughput: 0: 5906.4. Samples: 2854590. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 14:47:13,119][15372] Avg episode reward: [(0, '21.780')] [2024-08-05 14:47:15,338][15444] Updated weights for policy 0, policy_version 1401 (0.0011) [2024-08-05 14:47:18,119][15372] Fps is (10 sec: 23756.8, 60 sec: 23620.6, 300 sec: 23326.4). Total num frames: 11534336. Throughput: 0: 5909.8. Samples: 2891400. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 14:47:18,119][15372] Avg episode reward: [(0, '21.589')] [2024-08-05 14:47:18,780][15444] Updated weights for policy 0, policy_version 1411 (0.0028) [2024-08-05 14:47:22,392][15444] Updated weights for policy 0, policy_version 1421 (0.0022) [2024-08-05 14:47:23,119][15372] Fps is (10 sec: 23756.4, 60 sec: 23620.3, 300 sec: 23381.9). Total num frames: 11657216. Throughput: 0: 5913.6. Samples: 2909200. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 14:47:23,119][15372] Avg episode reward: [(0, '21.143')] [2024-08-05 14:47:25,786][15444] Updated weights for policy 0, policy_version 1431 (0.0017) [2024-08-05 14:47:28,119][15372] Fps is (10 sec: 24576.2, 60 sec: 23756.8, 300 sec: 23382.2). Total num frames: 11780096. Throughput: 0: 5921.5. Samples: 2944360. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:47:28,119][15372] Avg episode reward: [(0, '20.526')] [2024-08-05 14:47:29,237][15444] Updated weights for policy 0, policy_version 1441 (0.0011) [2024-08-05 14:47:32,548][15444] Updated weights for policy 0, policy_version 1451 (0.0013) [2024-08-05 14:47:33,118][15372] Fps is (10 sec: 23757.3, 60 sec: 23620.3, 300 sec: 23381.9). Total num frames: 11894784. Throughput: 0: 5926.5. Samples: 2979330. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 14:47:33,119][15372] Avg episode reward: [(0, '21.319')] [2024-08-05 14:47:36,043][15444] Updated weights for policy 0, policy_version 1461 (0.0018) [2024-08-05 14:47:38,119][15372] Fps is (10 sec: 22937.1, 60 sec: 23620.2, 300 sec: 23409.7). Total num frames: 12009472. Throughput: 0: 5921.8. Samples: 2997520. Policy #0 lag: (min: 0.0, avg: 3.7, max: 9.0) [2024-08-05 14:47:38,119][15372] Avg episode reward: [(0, '22.378')] [2024-08-05 14:47:38,200][15417] Saving new best policy, reward=22.378! [2024-08-05 14:47:39,724][15444] Updated weights for policy 0, policy_version 1471 (0.0013) [2024-08-05 14:47:43,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23483.7, 300 sec: 23465.2). Total num frames: 12124160. Throughput: 0: 5898.2. Samples: 3032000. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 14:47:43,119][15372] Avg episode reward: [(0, '21.583')] [2024-08-05 14:47:43,339][15444] Updated weights for policy 0, policy_version 1481 (0.0013) [2024-08-05 14:47:46,789][15444] Updated weights for policy 0, policy_version 1491 (0.0024) [2024-08-05 14:47:48,119][15372] Fps is (10 sec: 23757.3, 60 sec: 23756.8, 300 sec: 23520.8). Total num frames: 12247040. Throughput: 0: 5904.9. Samples: 3066750. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:47:48,119][15372] Avg episode reward: [(0, '21.282')] [2024-08-05 14:47:50,193][15444] Updated weights for policy 0, policy_version 1501 (0.0016) [2024-08-05 14:47:53,119][15372] Fps is (10 sec: 23756.3, 60 sec: 23620.2, 300 sec: 23548.5). Total num frames: 12361728. Throughput: 0: 5899.5. Samples: 3084990. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 14:47:53,127][15372] Avg episode reward: [(0, '21.540')] [2024-08-05 14:47:53,631][15444] Updated weights for policy 0, policy_version 1511 (0.0014) [2024-08-05 14:47:57,195][15444] Updated weights for policy 0, policy_version 1521 (0.0015) [2024-08-05 14:47:58,119][15372] Fps is (10 sec: 22937.5, 60 sec: 23483.7, 300 sec: 23631.8). Total num frames: 12476416. Throughput: 0: 5897.3. Samples: 3119970. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 14:47:58,119][15372] Avg episode reward: [(0, '22.001')] [2024-08-05 14:48:00,677][15444] Updated weights for policy 0, policy_version 1531 (0.0025) [2024-08-05 14:48:03,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23620.2, 300 sec: 23659.6). Total num frames: 12599296. Throughput: 0: 5889.5. Samples: 3156430. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 14:48:03,119][15372] Avg episode reward: [(0, '22.148')] [2024-08-05 14:48:04,080][15444] Updated weights for policy 0, policy_version 1541 (0.0014) [2024-08-05 14:48:07,426][15444] Updated weights for policy 0, policy_version 1551 (0.0023) [2024-08-05 14:48:08,119][15372] Fps is (10 sec: 23756.9, 60 sec: 23620.3, 300 sec: 23604.1). Total num frames: 12713984. Throughput: 0: 5882.7. Samples: 3173920. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:48:08,119][15372] Avg episode reward: [(0, '22.671')] [2024-08-05 14:48:08,121][15417] Saving new best policy, reward=22.671! [2024-08-05 14:48:09,152][15417] Signal inference workers to stop experience collection... (550 times) [2024-08-05 14:48:09,152][15417] Signal inference workers to resume experience collection... (550 times) [2024-08-05 14:48:09,204][15444] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-08-05 14:48:09,205][15444] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-08-05 14:48:10,993][15444] Updated weights for policy 0, policy_version 1561 (0.0012) [2024-08-05 14:48:13,118][15372] Fps is (10 sec: 23757.7, 60 sec: 23620.3, 300 sec: 23659.6). Total num frames: 12836864. Throughput: 0: 5892.0. Samples: 3209500. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:48:13,119][15372] Avg episode reward: [(0, '23.103')] [2024-08-05 14:48:13,119][15417] Saving new best policy, reward=23.103! [2024-08-05 14:48:14,463][15444] Updated weights for policy 0, policy_version 1571 (0.0014) [2024-08-05 14:48:18,109][15444] Updated weights for policy 0, policy_version 1581 (0.0018) [2024-08-05 14:48:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23620.3, 300 sec: 23659.6). Total num frames: 12951552. Throughput: 0: 5899.1. Samples: 3244790. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 14:48:18,120][15372] Avg episode reward: [(0, '22.840')] [2024-08-05 14:48:18,124][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000001581_12951552.pth... [2024-08-05 14:48:18,271][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000000888_7274496.pth [2024-08-05 14:48:21,386][15444] Updated weights for policy 0, policy_version 1591 (0.0041) [2024-08-05 14:48:23,119][15372] Fps is (10 sec: 22936.2, 60 sec: 23483.6, 300 sec: 23604.0). Total num frames: 13066240. Throughput: 0: 5880.8. Samples: 3262160. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 14:48:23,119][15372] Avg episode reward: [(0, '23.057')] [2024-08-05 14:48:24,962][15444] Updated weights for policy 0, policy_version 1601 (0.0011) [2024-08-05 14:48:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23483.8, 300 sec: 23631.9). Total num frames: 13189120. Throughput: 0: 5922.7. Samples: 3298520. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 14:48:28,126][15372] Avg episode reward: [(0, '22.597')] [2024-08-05 14:48:28,163][15444] Updated weights for policy 0, policy_version 1611 (0.0018) [2024-08-05 14:48:31,773][15444] Updated weights for policy 0, policy_version 1621 (0.0013) [2024-08-05 14:48:33,120][15372] Fps is (10 sec: 23754.0, 60 sec: 23483.0, 300 sec: 23631.7). Total num frames: 13303808. Throughput: 0: 5922.9. Samples: 3333290. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 14:48:33,121][15372] Avg episode reward: [(0, '23.098')] [2024-08-05 14:48:35,030][15444] Updated weights for policy 0, policy_version 1631 (0.0015) [2024-08-05 14:48:38,119][15372] Fps is (10 sec: 23756.4, 60 sec: 23620.3, 300 sec: 23659.6). Total num frames: 13426688. Throughput: 0: 5925.3. Samples: 3351630. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 14:48:38,119][15372] Avg episode reward: [(0, '23.495')] [2024-08-05 14:48:38,122][15417] Saving new best policy, reward=23.495! [2024-08-05 14:48:38,706][15444] Updated weights for policy 0, policy_version 1641 (0.0030) [2024-08-05 14:48:42,339][15444] Updated weights for policy 0, policy_version 1651 (0.0025) [2024-08-05 14:48:43,118][15372] Fps is (10 sec: 23761.0, 60 sec: 23620.3, 300 sec: 23631.8). Total num frames: 13541376. Throughput: 0: 5914.5. Samples: 3386120. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 14:48:43,119][15372] Avg episode reward: [(0, '24.113')] [2024-08-05 14:48:43,208][15417] Saving new best policy, reward=24.113! [2024-08-05 14:48:45,513][15444] Updated weights for policy 0, policy_version 1661 (0.0017) [2024-08-05 14:48:48,118][15372] Fps is (10 sec: 23757.1, 60 sec: 23620.3, 300 sec: 23659.6). Total num frames: 13664256. Throughput: 0: 5896.5. Samples: 3421770. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 14:48:48,119][15372] Avg episode reward: [(0, '22.748')] [2024-08-05 14:48:49,138][15444] Updated weights for policy 0, policy_version 1671 (0.0011) [2024-08-05 14:48:52,692][15444] Updated weights for policy 0, policy_version 1681 (0.0023) [2024-08-05 14:48:53,120][15372] Fps is (10 sec: 23754.0, 60 sec: 23619.9, 300 sec: 23659.5). Total num frames: 13778944. Throughput: 0: 5900.7. Samples: 3439460. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 14:48:53,120][15372] Avg episode reward: [(0, '22.101')] [2024-08-05 14:48:56,170][15444] Updated weights for policy 0, policy_version 1691 (0.0011) [2024-08-05 14:48:58,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23620.3, 300 sec: 23631.8). Total num frames: 13893632. Throughput: 0: 5879.6. Samples: 3474080. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 14:48:58,126][15372] Avg episode reward: [(0, '22.968')] [2024-08-05 14:48:59,440][15444] Updated weights for policy 0, policy_version 1701 (0.0034) [2024-08-05 14:49:03,119][15372] Fps is (10 sec: 22940.0, 60 sec: 23483.8, 300 sec: 23631.9). Total num frames: 14008320. Throughput: 0: 5884.2. Samples: 3509580. Policy #0 lag: (min: 0.0, avg: 3.0, max: 8.0) [2024-08-05 14:49:03,119][15372] Avg episode reward: [(0, '23.569')] [2024-08-05 14:49:03,128][15444] Updated weights for policy 0, policy_version 1711 (0.0012) [2024-08-05 14:49:06,535][15444] Updated weights for policy 0, policy_version 1721 (0.0030) [2024-08-05 14:49:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23620.3, 300 sec: 23659.6). Total num frames: 14131200. Throughput: 0: 5891.9. Samples: 3527290. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 14:49:08,119][15372] Avg episode reward: [(0, '22.789')] [2024-08-05 14:49:09,869][15444] Updated weights for policy 0, policy_version 1731 (0.0020) [2024-08-05 14:49:13,119][15372] Fps is (10 sec: 24575.3, 60 sec: 23620.1, 300 sec: 23659.6). Total num frames: 14254080. Throughput: 0: 5893.3. Samples: 3563720. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 14:49:13,119][15372] Avg episode reward: [(0, '23.462')] [2024-08-05 14:49:13,293][15444] Updated weights for policy 0, policy_version 1741 (0.0023) [2024-08-05 14:49:16,889][15444] Updated weights for policy 0, policy_version 1751 (0.0012) [2024-08-05 14:49:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23620.3, 300 sec: 23659.6). Total num frames: 14368768. Throughput: 0: 5894.0. Samples: 3598510. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:49:18,119][15372] Avg episode reward: [(0, '23.864')] [2024-08-05 14:49:19,488][15417] Signal inference workers to stop experience collection... (600 times) [2024-08-05 14:49:19,488][15417] Signal inference workers to resume experience collection... 
(600 times) [2024-08-05 14:49:19,525][15444] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-08-05 14:49:19,525][15444] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-08-05 14:49:20,194][15444] Updated weights for policy 0, policy_version 1761 (0.0024) [2024-08-05 14:49:23,118][15372] Fps is (10 sec: 23757.9, 60 sec: 23757.0, 300 sec: 23687.4). Total num frames: 14491648. Throughput: 0: 5900.5. Samples: 3617150. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 14:49:23,119][15372] Avg episode reward: [(0, '22.641')] [2024-08-05 14:49:23,798][15444] Updated weights for policy 0, policy_version 1771 (0.0017) [2024-08-05 14:49:26,996][15444] Updated weights for policy 0, policy_version 1781 (0.0013) [2024-08-05 14:49:28,119][15372] Fps is (10 sec: 23756.4, 60 sec: 23620.2, 300 sec: 23659.6). Total num frames: 14606336. Throughput: 0: 5924.6. Samples: 3652730. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 14:49:28,119][15372] Avg episode reward: [(0, '22.858')] [2024-08-05 14:49:30,604][15444] Updated weights for policy 0, policy_version 1791 (0.0017) [2024-08-05 14:49:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23757.5, 300 sec: 23687.8). Total num frames: 14729216. Throughput: 0: 5951.1. Samples: 3689570. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 14:49:33,119][15372] Avg episode reward: [(0, '24.303')] [2024-08-05 14:49:33,142][15417] Saving new best policy, reward=24.303! [2024-08-05 14:49:34,052][15444] Updated weights for policy 0, policy_version 1801 (0.0010) [2024-08-05 14:49:37,426][15444] Updated weights for policy 0, policy_version 1811 (0.0011) [2024-08-05 14:49:38,118][15372] Fps is (10 sec: 23757.2, 60 sec: 23620.3, 300 sec: 23687.4). Total num frames: 14843904. Throughput: 0: 5934.8. Samples: 3706520. 
Policy #0 lag: (min: 2.0, avg: 4.3, max: 9.0) [2024-08-05 14:49:38,119][15372] Avg episode reward: [(0, '24.350')] [2024-08-05 14:49:38,122][15417] Saving new best policy, reward=24.350! [2024-08-05 14:49:40,850][15444] Updated weights for policy 0, policy_version 1821 (0.0020) [2024-08-05 14:49:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23756.8, 300 sec: 23659.6). Total num frames: 14966784. Throughput: 0: 5936.7. Samples: 3741230. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 14:49:43,119][15372] Avg episode reward: [(0, '22.702')] [2024-08-05 14:49:44,643][15444] Updated weights for policy 0, policy_version 1831 (0.0049) [2024-08-05 14:49:48,119][15372] Fps is (10 sec: 22936.4, 60 sec: 23483.5, 300 sec: 23631.8). Total num frames: 15073280. Throughput: 0: 5906.0. Samples: 3775350. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 14:49:48,120][15372] Avg episode reward: [(0, '22.790')] [2024-08-05 14:49:48,143][15444] Updated weights for policy 0, policy_version 1841 (0.0015) [2024-08-05 14:49:51,531][15444] Updated weights for policy 0, policy_version 1851 (0.0016) [2024-08-05 14:49:53,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23620.7, 300 sec: 23659.6). Total num frames: 15196160. Throughput: 0: 5918.9. Samples: 3793640. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 14:49:53,119][15372] Avg episode reward: [(0, '23.653')] [2024-08-05 14:49:55,188][15444] Updated weights for policy 0, policy_version 1861 (0.0038) [2024-08-05 14:49:58,118][15372] Fps is (10 sec: 24577.2, 60 sec: 23756.8, 300 sec: 23631.8). Total num frames: 15319040. Throughput: 0: 5901.6. Samples: 3829290. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 14:49:58,119][15372] Avg episode reward: [(0, '23.621')] [2024-08-05 14:49:58,438][15444] Updated weights for policy 0, policy_version 1871 (0.0014) [2024-08-05 14:50:00,072][15417] Signal inference workers to stop experience collection... 
(650 times) [2024-08-05 14:50:00,073][15417] Signal inference workers to resume experience collection... (650 times) [2024-08-05 14:50:00,120][15444] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-08-05 14:50:00,120][15444] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-08-05 14:50:01,916][15444] Updated weights for policy 0, policy_version 1881 (0.0022) [2024-08-05 14:50:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23756.9, 300 sec: 23659.6). Total num frames: 15433728. Throughput: 0: 5913.8. Samples: 3864630. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:50:03,119][15372] Avg episode reward: [(0, '24.273')] [2024-08-05 14:50:05,440][15444] Updated weights for policy 0, policy_version 1891 (0.0013) [2024-08-05 14:50:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23756.8, 300 sec: 23659.6). Total num frames: 15556608. Throughput: 0: 5905.1. Samples: 3882880. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 14:50:08,119][15372] Avg episode reward: [(0, '24.830')] [2024-08-05 14:50:08,122][15417] Saving new best policy, reward=24.830! [2024-08-05 14:50:08,901][15444] Updated weights for policy 0, policy_version 1901 (0.0012) [2024-08-05 14:50:12,244][15444] Updated weights for policy 0, policy_version 1911 (0.0020) [2024-08-05 14:50:13,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23620.4, 300 sec: 23631.8). Total num frames: 15671296. Throughput: 0: 5902.0. Samples: 3918320. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 14:50:13,119][15372] Avg episode reward: [(0, '24.365')] [2024-08-05 14:50:15,632][15444] Updated weights for policy 0, policy_version 1921 (0.0026) [2024-08-05 14:50:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23756.8, 300 sec: 23687.4). Total num frames: 15794176. Throughput: 0: 5883.8. Samples: 3954340. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:50:18,119][15372] Avg episode reward: [(0, '24.173')] [2024-08-05 14:50:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000001928_15794176.pth... [2024-08-05 14:50:18,252][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000001235_10117120.pth [2024-08-05 14:50:19,321][15444] Updated weights for policy 0, policy_version 1931 (0.0011) [2024-08-05 14:50:22,811][15444] Updated weights for policy 0, policy_version 1941 (0.0014) [2024-08-05 14:50:23,119][15372] Fps is (10 sec: 23756.8, 60 sec: 23620.2, 300 sec: 23659.6). Total num frames: 15908864. Throughput: 0: 5900.7. Samples: 3972050. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:50:23,119][15372] Avg episode reward: [(0, '24.125')] [2024-08-05 14:50:26,172][15444] Updated weights for policy 0, policy_version 1951 (0.0022) [2024-08-05 14:50:28,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23620.3, 300 sec: 23631.9). Total num frames: 16023552. Throughput: 0: 5911.6. Samples: 4007250. Policy #0 lag: (min: 0.0, avg: 3.2, max: 9.0) [2024-08-05 14:50:28,126][15372] Avg episode reward: [(0, '23.094')] [2024-08-05 14:50:29,484][15444] Updated weights for policy 0, policy_version 1961 (0.0030) [2024-08-05 14:50:33,019][15444] Updated weights for policy 0, policy_version 1971 (0.0020) [2024-08-05 14:50:33,119][15372] Fps is (10 sec: 23755.9, 60 sec: 23620.1, 300 sec: 23659.6). Total num frames: 16146432. Throughput: 0: 5944.0. Samples: 4042830. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 14:50:33,119][15372] Avg episode reward: [(0, '23.484')] [2024-08-05 14:50:36,498][15444] Updated weights for policy 0, policy_version 1981 (0.0012) [2024-08-05 14:50:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 23756.8, 300 sec: 23659.7). Total num frames: 16269312. 
Throughput: 0: 5938.0. Samples: 4060850. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 14:50:38,119][15372] Avg episode reward: [(0, '22.656')] [2024-08-05 14:50:39,877][15444] Updated weights for policy 0, policy_version 1991 (0.0028) [2024-08-05 14:50:43,119][15372] Fps is (10 sec: 23757.5, 60 sec: 23620.2, 300 sec: 23659.6). Total num frames: 16384000. Throughput: 0: 5940.4. Samples: 4096610. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 14:50:43,119][15372] Avg episode reward: [(0, '23.670')] [2024-08-05 14:50:43,523][15444] Updated weights for policy 0, policy_version 2001 (0.0016) [2024-08-05 14:50:46,660][15444] Updated weights for policy 0, policy_version 2011 (0.0011) [2024-08-05 14:50:48,115][15417] Signal inference workers to stop experience collection... (700 times) [2024-08-05 14:50:48,115][15417] Signal inference workers to resume experience collection... (700 times) [2024-08-05 14:50:48,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23757.0, 300 sec: 23631.8). Total num frames: 16498688. Throughput: 0: 5932.2. Samples: 4131580. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 14:50:48,119][15372] Avg episode reward: [(0, '24.997')] [2024-08-05 14:50:48,160][15444] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-08-05 14:50:48,161][15444] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-08-05 14:50:48,195][15417] Saving new best policy, reward=24.997! [2024-08-05 14:50:50,265][15444] Updated weights for policy 0, policy_version 2021 (0.0010) [2024-08-05 14:50:53,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23756.7, 300 sec: 23659.6). Total num frames: 16621568. Throughput: 0: 5927.7. Samples: 4149630. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 14:50:53,119][15372] Avg episode reward: [(0, '24.738')] [2024-08-05 14:50:53,809][15444] Updated weights for policy 0, policy_version 2031 (0.0016) [2024-08-05 14:50:57,159][15444] Updated weights for policy 0, policy_version 2041 (0.0013) [2024-08-05 14:50:58,119][15372] Fps is (10 sec: 23755.0, 60 sec: 23620.0, 300 sec: 23631.8). Total num frames: 16736256. Throughput: 0: 5929.5. Samples: 4185150. Policy #0 lag: (min: 1.0, avg: 4.2, max: 7.0) [2024-08-05 14:50:58,120][15372] Avg episode reward: [(0, '24.429')] [2024-08-05 14:51:00,671][15444] Updated weights for policy 0, policy_version 2051 (0.0020) [2024-08-05 14:51:03,119][15372] Fps is (10 sec: 23757.1, 60 sec: 23756.7, 300 sec: 23659.7). Total num frames: 16859136. Throughput: 0: 5942.2. Samples: 4221740. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 14:51:03,119][15372] Avg episode reward: [(0, '24.368')] [2024-08-05 14:51:03,979][15444] Updated weights for policy 0, policy_version 2061 (0.0017) [2024-08-05 14:51:07,674][15444] Updated weights for policy 0, policy_version 2071 (0.0020) [2024-08-05 14:51:08,119][15372] Fps is (10 sec: 24577.3, 60 sec: 23756.7, 300 sec: 23687.4). Total num frames: 16982016. Throughput: 0: 5929.3. Samples: 4238870. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 14:51:08,119][15372] Avg episode reward: [(0, '24.187')] [2024-08-05 14:51:10,893][15444] Updated weights for policy 0, policy_version 2081 (0.0017) [2024-08-05 14:51:13,119][15372] Fps is (10 sec: 23756.8, 60 sec: 23756.8, 300 sec: 23659.7). Total num frames: 17096704. Throughput: 0: 5923.1. Samples: 4273790. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 14:51:13,126][15372] Avg episode reward: [(0, '24.352')] [2024-08-05 14:51:14,469][15444] Updated weights for policy 0, policy_version 2091 (0.0026) [2024-08-05 14:51:18,118][15372] Fps is (10 sec: 22118.7, 60 sec: 23483.7, 300 sec: 23604.1). Total num frames: 17203200. 
Throughput: 0: 5909.2. Samples: 4308740. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:51:18,119][15372] Avg episode reward: [(0, '23.945')] [2024-08-05 14:51:18,241][15444] Updated weights for policy 0, policy_version 2101 (0.0017) [2024-08-05 14:51:21,427][15444] Updated weights for policy 0, policy_version 2111 (0.0035) [2024-08-05 14:51:23,119][15372] Fps is (10 sec: 23757.0, 60 sec: 23756.8, 300 sec: 23659.6). Total num frames: 17334272. Throughput: 0: 5916.0. Samples: 4327070. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:51:23,119][15372] Avg episode reward: [(0, '24.458')] [2024-08-05 14:51:24,938][15444] Updated weights for policy 0, policy_version 2121 (0.0017) [2024-08-05 14:51:28,119][15372] Fps is (10 sec: 23756.8, 60 sec: 23620.3, 300 sec: 23604.1). Total num frames: 17440768. Throughput: 0: 5887.3. Samples: 4361540. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:51:28,126][15372] Avg episode reward: [(0, '23.996')] [2024-08-05 14:51:28,939][15444] Updated weights for policy 0, policy_version 2131 (0.0013) [2024-08-05 14:51:32,775][15444] Updated weights for policy 0, policy_version 2141 (0.0013) [2024-08-05 14:51:33,118][15372] Fps is (10 sec: 21299.4, 60 sec: 23347.4, 300 sec: 23576.3). Total num frames: 17547264. Throughput: 0: 5796.2. Samples: 4392410. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:51:33,119][15372] Avg episode reward: [(0, '23.930')] [2024-08-05 14:51:36,170][15444] Updated weights for policy 0, policy_version 2151 (0.0021) [2024-08-05 14:51:38,119][15372] Fps is (10 sec: 22118.3, 60 sec: 23210.6, 300 sec: 23548.5). Total num frames: 17661952. Throughput: 0: 5794.2. Samples: 4410370. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:51:38,119][15372] Avg episode reward: [(0, '24.138')] [2024-08-05 14:51:39,750][15444] Updated weights for policy 0, policy_version 2161 (0.0013) [2024-08-05 14:51:43,119][15372] Fps is (10 sec: 22936.1, 60 sec: 23210.5, 300 sec: 23576.3). Total num frames: 17776640. Throughput: 0: 5782.5. Samples: 4445360. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 14:51:43,119][15372] Avg episode reward: [(0, '23.710')] [2024-08-05 14:51:43,201][15444] Updated weights for policy 0, policy_version 2171 (0.0013) [2024-08-05 14:51:46,752][15444] Updated weights for policy 0, policy_version 2181 (0.0017) [2024-08-05 14:51:48,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23210.6, 300 sec: 23548.5). Total num frames: 17891328. Throughput: 0: 5732.2. Samples: 4479690. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 14:51:48,119][15372] Avg episode reward: [(0, '24.949')] [2024-08-05 14:51:50,179][15444] Updated weights for policy 0, policy_version 2191 (0.0023) [2024-08-05 14:51:53,119][15372] Fps is (10 sec: 22938.3, 60 sec: 23074.1, 300 sec: 23520.7). Total num frames: 18006016. Throughput: 0: 5750.0. Samples: 4497620. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 14:51:53,119][15372] Avg episode reward: [(0, '25.143')] [2024-08-05 14:51:53,176][15417] Saving new best policy, reward=25.143! [2024-08-05 14:51:53,648][15417] Signal inference workers to stop experience collection... (750 times) [2024-08-05 14:51:53,655][15417] Signal inference workers to resume experience collection... 
(750 times) [2024-08-05 14:51:53,734][15444] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-08-05 14:51:53,742][15444] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-08-05 14:51:53,743][15444] Updated weights for policy 0, policy_version 2201 (0.0013) [2024-08-05 14:51:57,364][15444] Updated weights for policy 0, policy_version 2211 (0.0020) [2024-08-05 14:51:58,119][15372] Fps is (10 sec: 22937.5, 60 sec: 23074.3, 300 sec: 23520.8). Total num frames: 18120704. Throughput: 0: 5731.5. Samples: 4531710. Policy #0 lag: (min: 1.0, avg: 4.7, max: 8.0) [2024-08-05 14:51:58,119][15372] Avg episode reward: [(0, '24.230')] [2024-08-05 14:52:00,863][15444] Updated weights for policy 0, policy_version 2221 (0.0022) [2024-08-05 14:52:03,119][15372] Fps is (10 sec: 23757.0, 60 sec: 23074.1, 300 sec: 23548.5). Total num frames: 18243584. Throughput: 0: 5756.9. Samples: 4567800. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 14:52:03,119][15372] Avg episode reward: [(0, '23.653')] [2024-08-05 14:52:04,521][15444] Updated weights for policy 0, policy_version 2231 (0.0012) [2024-08-05 14:52:07,711][15444] Updated weights for policy 0, policy_version 2241 (0.0014) [2024-08-05 14:52:08,118][15372] Fps is (10 sec: 24576.6, 60 sec: 23074.2, 300 sec: 23548.5). Total num frames: 18366464. Throughput: 0: 5732.0. Samples: 4585010. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 14:52:08,119][15372] Avg episode reward: [(0, '24.206')] [2024-08-05 14:52:11,377][15444] Updated weights for policy 0, policy_version 2251 (0.0017) [2024-08-05 14:52:13,118][15372] Fps is (10 sec: 23757.3, 60 sec: 23074.2, 300 sec: 23548.5). Total num frames: 18481152. Throughput: 0: 5743.3. Samples: 4619990. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:52:13,119][15372] Avg episode reward: [(0, '24.106')] [2024-08-05 14:52:14,789][15444] Updated weights for policy 0, policy_version 2261 (0.0014) [2024-08-05 14:52:18,107][15444] Updated weights for policy 0, policy_version 2271 (0.0017) [2024-08-05 14:52:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23347.2, 300 sec: 23548.5). Total num frames: 18604032. Throughput: 0: 5861.1. Samples: 4656160. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:52:18,119][15372] Avg episode reward: [(0, '25.097')] [2024-08-05 14:52:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000002271_18604032.pth... [2024-08-05 14:52:18,247][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000001581_12951552.pth [2024-08-05 14:52:21,599][15444] Updated weights for policy 0, policy_version 2281 (0.0013) [2024-08-05 14:52:23,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23074.1, 300 sec: 23520.8). Total num frames: 18718720. Throughput: 0: 5852.4. Samples: 4673730. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 14:52:23,119][15372] Avg episode reward: [(0, '24.898')] [2024-08-05 14:52:24,992][15444] Updated weights for policy 0, policy_version 2291 (0.0024) [2024-08-05 14:52:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23347.2, 300 sec: 23548.5). Total num frames: 18841600. Throughput: 0: 5871.9. Samples: 4709590. Policy #0 lag: (min: 2.0, avg: 3.8, max: 8.0) [2024-08-05 14:52:28,119][15372] Avg episode reward: [(0, '24.835')] [2024-08-05 14:52:28,832][15444] Updated weights for policy 0, policy_version 2301 (0.0019) [2024-08-05 14:52:31,985][15444] Updated weights for policy 0, policy_version 2311 (0.0019) [2024-08-05 14:52:33,119][15372] Fps is (10 sec: 22937.6, 60 sec: 23347.2, 300 sec: 23520.8). Total num frames: 18948096. 
Throughput: 0: 5888.7. Samples: 4744680. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 14:52:33,119][15372] Avg episode reward: [(0, '24.288')] [2024-08-05 14:52:35,541][15444] Updated weights for policy 0, policy_version 2321 (0.0018) [2024-08-05 14:52:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23620.3, 300 sec: 23576.3). Total num frames: 19079168. Throughput: 0: 5898.5. Samples: 4763050. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 14:52:38,119][15372] Avg episode reward: [(0, '24.425')] [2024-08-05 14:52:38,667][15444] Updated weights for policy 0, policy_version 2331 (0.0017) [2024-08-05 14:52:42,438][15444] Updated weights for policy 0, policy_version 2341 (0.0012) [2024-08-05 14:52:43,118][15372] Fps is (10 sec: 24576.2, 60 sec: 23620.5, 300 sec: 23548.5). Total num frames: 19193856. Throughput: 0: 5925.8. Samples: 4798370. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 14:52:43,119][15372] Avg episode reward: [(0, '25.234')] [2024-08-05 14:52:43,119][15417] Saving new best policy, reward=25.234! [2024-08-05 14:52:45,021][15417] Signal inference workers to stop experience collection... (800 times) [2024-08-05 14:52:45,021][15417] Signal inference workers to resume experience collection... (800 times) [2024-08-05 14:52:45,097][15444] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-08-05 14:52:45,097][15444] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-08-05 14:52:45,863][15444] Updated weights for policy 0, policy_version 2351 (0.0018) [2024-08-05 14:52:48,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23620.3, 300 sec: 23548.5). Total num frames: 19308544. Throughput: 0: 5897.8. Samples: 4833200. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 14:52:48,119][15372] Avg episode reward: [(0, '25.022')] [2024-08-05 14:52:49,452][15444] Updated weights for policy 0, policy_version 2361 (0.0017) [2024-08-05 14:52:53,035][15444] Updated weights for policy 0, policy_version 2371 (0.0034) [2024-08-05 14:52:53,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23620.4, 300 sec: 23548.5). Total num frames: 19423232. Throughput: 0: 5918.0. Samples: 4851320. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 14:52:53,119][15372] Avg episode reward: [(0, '25.117')] [2024-08-05 14:52:56,209][15444] Updated weights for policy 0, policy_version 2381 (0.0014) [2024-08-05 14:52:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23756.9, 300 sec: 23548.6). Total num frames: 19546112. Throughput: 0: 5917.1. Samples: 4886260. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 14:52:58,119][15372] Avg episode reward: [(0, '26.000')] [2024-08-05 14:52:58,122][15417] Saving new best policy, reward=26.000! [2024-08-05 14:52:59,919][15444] Updated weights for policy 0, policy_version 2391 (0.0028) [2024-08-05 14:53:03,117][15444] Updated weights for policy 0, policy_version 2401 (0.0012) [2024-08-05 14:53:03,119][15372] Fps is (10 sec: 24575.0, 60 sec: 23756.7, 300 sec: 23576.3). Total num frames: 19668992. Throughput: 0: 5901.5. Samples: 4921730. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 14:53:03,119][15372] Avg episode reward: [(0, '25.382')] [2024-08-05 14:53:06,920][15444] Updated weights for policy 0, policy_version 2411 (0.0013) [2024-08-05 14:53:08,119][15372] Fps is (10 sec: 22935.5, 60 sec: 23483.4, 300 sec: 23520.7). Total num frames: 19775488. Throughput: 0: 5904.1. Samples: 4939420. 
Policy #0 lag: (min: 1.0, avg: 4.7, max: 9.0) [2024-08-05 14:53:08,120][15372] Avg episode reward: [(0, '24.951')] [2024-08-05 14:53:10,219][15444] Updated weights for policy 0, policy_version 2421 (0.0026) [2024-08-05 14:53:13,119][15372] Fps is (10 sec: 22119.0, 60 sec: 23483.7, 300 sec: 23520.7). Total num frames: 19890176. Throughput: 0: 5897.8. Samples: 4974990. Policy #0 lag: (min: 0.0, avg: 4.9, max: 9.0) [2024-08-05 14:53:13,127][15372] Avg episode reward: [(0, '25.973')] [2024-08-05 14:53:13,751][15444] Updated weights for policy 0, policy_version 2431 (0.0032) [2024-08-05 14:53:17,361][15444] Updated weights for policy 0, policy_version 2441 (0.0030) [2024-08-05 14:53:18,119][15372] Fps is (10 sec: 23758.9, 60 sec: 23483.7, 300 sec: 23548.6). Total num frames: 20013056. Throughput: 0: 5892.9. Samples: 5009860. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 14:53:18,123][15372] Avg episode reward: [(0, '26.015')] [2024-08-05 14:53:18,128][15417] Saving new best policy, reward=26.015! [2024-08-05 14:53:20,554][15444] Updated weights for policy 0, policy_version 2451 (0.0011) [2024-08-05 14:53:23,119][15372] Fps is (10 sec: 24575.8, 60 sec: 23620.2, 300 sec: 23548.5). Total num frames: 20135936. Throughput: 0: 5877.1. Samples: 5027520. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:53:23,119][15372] Avg episode reward: [(0, '25.223')] [2024-08-05 14:53:24,304][15444] Updated weights for policy 0, policy_version 2461 (0.0022) [2024-08-05 14:53:27,603][15444] Updated weights for policy 0, policy_version 2471 (0.0048) [2024-08-05 14:53:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23483.7, 300 sec: 23548.7). Total num frames: 20250624. Throughput: 0: 5890.7. Samples: 5063450. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 14:53:28,119][15372] Avg episode reward: [(0, '25.506')] [2024-08-05 14:53:28,986][15417] Signal inference workers to stop experience collection... 
(850 times) [2024-08-05 14:53:28,986][15417] Signal inference workers to resume experience collection... (850 times) [2024-08-05 14:53:29,029][15444] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-08-05 14:53:29,029][15444] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-08-05 14:53:31,070][15444] Updated weights for policy 0, policy_version 2481 (0.0028) [2024-08-05 14:53:33,119][15372] Fps is (10 sec: 23756.9, 60 sec: 23756.8, 300 sec: 23548.5). Total num frames: 20373504. Throughput: 0: 5901.3. Samples: 5098760. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 14:53:33,120][15372] Avg episode reward: [(0, '25.303')] [2024-08-05 14:53:34,627][15444] Updated weights for policy 0, policy_version 2491 (0.0016) [2024-08-05 14:53:38,025][15444] Updated weights for policy 0, policy_version 2501 (0.0018) [2024-08-05 14:53:38,121][15372] Fps is (10 sec: 23751.4, 60 sec: 23482.8, 300 sec: 23548.3). Total num frames: 20488192. Throughput: 0: 5889.7. Samples: 5116370. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:53:38,121][15372] Avg episode reward: [(0, '25.897')] [2024-08-05 14:53:41,519][15444] Updated weights for policy 0, policy_version 2511 (0.0013) [2024-08-05 14:53:43,120][15372] Fps is (10 sec: 22935.4, 60 sec: 23483.3, 300 sec: 23520.7). Total num frames: 20602880. Throughput: 0: 5897.2. Samples: 5151640. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 14:53:43,120][15372] Avg episode reward: [(0, '25.753')] [2024-08-05 14:53:44,760][15444] Updated weights for policy 0, policy_version 2521 (0.0028) [2024-08-05 14:53:48,119][15372] Fps is (10 sec: 23762.1, 60 sec: 23620.3, 300 sec: 23548.6). Total num frames: 20725760. Throughput: 0: 5892.5. Samples: 5186890. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 14:53:48,126][15372] Avg episode reward: [(0, '24.967')] [2024-08-05 14:53:48,566][15444] Updated weights for policy 0, policy_version 2531 (0.0028) [2024-08-05 14:53:52,166][15444] Updated weights for policy 0, policy_version 2541 (0.0025) [2024-08-05 14:53:53,118][15372] Fps is (10 sec: 23759.6, 60 sec: 23620.3, 300 sec: 23548.5). Total num frames: 20840448. Throughput: 0: 5896.6. Samples: 5204760. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 14:53:53,119][15372] Avg episode reward: [(0, '24.597')] [2024-08-05 14:53:55,455][15444] Updated weights for policy 0, policy_version 2551 (0.0010) [2024-08-05 14:53:58,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23620.3, 300 sec: 23576.3). Total num frames: 20963328. Throughput: 0: 5900.0. Samples: 5240490. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 14:53:58,119][15372] Avg episode reward: [(0, '25.810')] [2024-08-05 14:53:59,077][15444] Updated weights for policy 0, policy_version 2561 (0.0018) [2024-08-05 14:54:02,114][15444] Updated weights for policy 0, policy_version 2571 (0.0020) [2024-08-05 14:54:03,119][15372] Fps is (10 sec: 22936.8, 60 sec: 23347.2, 300 sec: 23520.7). Total num frames: 21069824. Throughput: 0: 5886.9. Samples: 5274770. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 14:54:03,119][15372] Avg episode reward: [(0, '25.745')] [2024-08-05 14:54:05,910][15444] Updated weights for policy 0, policy_version 2581 (0.0016) [2024-08-05 14:54:08,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23620.6, 300 sec: 23520.8). Total num frames: 21192704. Throughput: 0: 5901.4. Samples: 5293080. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:54:08,119][15372] Avg episode reward: [(0, '24.572')] [2024-08-05 14:54:09,379][15444] Updated weights for policy 0, policy_version 2591 (0.0019) [2024-08-05 14:54:12,838][15444] Updated weights for policy 0, policy_version 2601 (0.0028) [2024-08-05 14:54:13,118][15372] Fps is (10 sec: 24576.9, 60 sec: 23756.9, 300 sec: 23548.5). Total num frames: 21315584. Throughput: 0: 5882.9. Samples: 5328180. Policy #0 lag: (min: 1.0, avg: 4.6, max: 8.0) [2024-08-05 14:54:13,119][15372] Avg episode reward: [(0, '25.230')] [2024-08-05 14:54:16,519][15444] Updated weights for policy 0, policy_version 2611 (0.0013) [2024-08-05 14:54:18,119][15372] Fps is (10 sec: 22937.5, 60 sec: 23483.7, 300 sec: 23493.0). Total num frames: 21422080. Throughput: 0: 5860.9. Samples: 5362500. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 14:54:18,119][15372] Avg episode reward: [(0, '25.289')] [2024-08-05 14:54:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000002615_21422080.pth... [2024-08-05 14:54:18,244][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000001928_15794176.pth [2024-08-05 14:54:19,975][15444] Updated weights for policy 0, policy_version 2621 (0.0020) [2024-08-05 14:54:23,118][15372] Fps is (10 sec: 22118.4, 60 sec: 23347.3, 300 sec: 23493.0). Total num frames: 21536768. Throughput: 0: 5860.1. Samples: 5380060. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 14:54:23,126][15372] Avg episode reward: [(0, '25.938')] [2024-08-05 14:54:23,479][15444] Updated weights for policy 0, policy_version 2631 (0.0014) [2024-08-05 14:54:27,048][15444] Updated weights for policy 0, policy_version 2641 (0.0042) [2024-08-05 14:54:27,873][15417] Signal inference workers to stop experience collection... 
(900 times) [2024-08-05 14:54:27,873][15417] Signal inference workers to resume experience collection... (900 times) [2024-08-05 14:54:27,915][15444] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-08-05 14:54:27,916][15444] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-08-05 14:54:28,119][15372] Fps is (10 sec: 23756.8, 60 sec: 23483.7, 300 sec: 23493.0). Total num frames: 21659648. Throughput: 0: 5842.4. Samples: 5414540. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:54:28,119][15372] Avg episode reward: [(0, '26.501')] [2024-08-05 14:54:28,122][15417] Saving new best policy, reward=26.501! [2024-08-05 14:54:30,576][15444] Updated weights for policy 0, policy_version 2651 (0.0012) [2024-08-05 14:54:33,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23347.3, 300 sec: 23493.0). Total num frames: 21774336. Throughput: 0: 5854.2. Samples: 5450330. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 14:54:33,119][15372] Avg episode reward: [(0, '25.012')] [2024-08-05 14:54:34,084][15444] Updated weights for policy 0, policy_version 2661 (0.0016) [2024-08-05 14:54:37,777][15444] Updated weights for policy 0, policy_version 2671 (0.0021) [2024-08-05 14:54:38,119][15372] Fps is (10 sec: 22118.0, 60 sec: 23211.5, 300 sec: 23437.4). Total num frames: 21880832. Throughput: 0: 5846.4. Samples: 5467850. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 14:54:38,119][15372] Avg episode reward: [(0, '25.445')] [2024-08-05 14:54:41,034][15444] Updated weights for policy 0, policy_version 2681 (0.0019) [2024-08-05 14:54:43,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23347.7, 300 sec: 23493.0). Total num frames: 22003712. Throughput: 0: 5803.6. Samples: 5501650. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 14:54:43,119][15372] Avg episode reward: [(0, '25.914')] [2024-08-05 14:54:44,717][15444] Updated weights for policy 0, policy_version 2691 (0.0012) [2024-08-05 14:54:48,118][15372] Fps is (10 sec: 23757.3, 60 sec: 23210.7, 300 sec: 23465.2). Total num frames: 22118400. Throughput: 0: 5803.8. Samples: 5535940. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 14:54:48,126][15372] Avg episode reward: [(0, '24.871')] [2024-08-05 14:54:48,440][15444] Updated weights for policy 0, policy_version 2701 (0.0028) [2024-08-05 14:54:51,916][15444] Updated weights for policy 0, policy_version 2711 (0.0036) [2024-08-05 14:54:53,119][15372] Fps is (10 sec: 22936.6, 60 sec: 23210.5, 300 sec: 23437.4). Total num frames: 22233088. Throughput: 0: 5796.8. Samples: 5553940. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 14:54:53,119][15372] Avg episode reward: [(0, '24.693')] [2024-08-05 14:54:55,216][15444] Updated weights for policy 0, policy_version 2721 (0.0015) [2024-08-05 14:54:58,119][15372] Fps is (10 sec: 22937.5, 60 sec: 23074.1, 300 sec: 23437.4). Total num frames: 22347776. Throughput: 0: 5788.2. Samples: 5588650. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 14:54:58,126][15372] Avg episode reward: [(0, '25.274')] [2024-08-05 14:54:59,032][15444] Updated weights for policy 0, policy_version 2731 (0.0031) [2024-08-05 14:55:02,261][15444] Updated weights for policy 0, policy_version 2741 (0.0017) [2024-08-05 14:55:03,120][15372] Fps is (10 sec: 22934.9, 60 sec: 23210.2, 300 sec: 23409.6). Total num frames: 22462464. Throughput: 0: 5800.0. Samples: 5623510. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 14:55:03,120][15372] Avg episode reward: [(0, '25.522')] [2024-08-05 14:55:05,902][15444] Updated weights for policy 0, policy_version 2751 (0.0027) [2024-08-05 14:55:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23210.6, 300 sec: 23437.5). Total num frames: 22585344. 
Throughput: 0: 5804.7. Samples: 5641270. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 14:55:08,120][15372] Avg episode reward: [(0, '25.645')] [2024-08-05 14:55:09,457][15444] Updated weights for policy 0, policy_version 2761 (0.0012) [2024-08-05 14:55:12,800][15444] Updated weights for policy 0, policy_version 2771 (0.0024) [2024-08-05 14:55:13,118][15372] Fps is (10 sec: 23760.6, 60 sec: 23074.1, 300 sec: 23409.7). Total num frames: 22700032. Throughput: 0: 5810.7. Samples: 5676020. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:55:13,119][15372] Avg episode reward: [(0, '25.358')] [2024-08-05 14:55:16,497][15444] Updated weights for policy 0, policy_version 2781 (0.0010) [2024-08-05 14:55:18,119][15372] Fps is (10 sec: 22936.5, 60 sec: 23210.5, 300 sec: 23409.6). Total num frames: 22814720. Throughput: 0: 5789.9. Samples: 5710880. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 14:55:18,119][15372] Avg episode reward: [(0, '26.368')] [2024-08-05 14:55:19,830][15444] Updated weights for policy 0, policy_version 2791 (0.0011) [2024-08-05 14:55:22,600][15417] Signal inference workers to stop experience collection... (950 times) [2024-08-05 14:55:22,606][15417] Signal inference workers to resume experience collection... (950 times) [2024-08-05 14:55:22,649][15444] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-08-05 14:55:22,656][15444] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-08-05 14:55:23,120][15372] Fps is (10 sec: 23753.3, 60 sec: 23346.6, 300 sec: 23437.3). Total num frames: 22937600. Throughput: 0: 5801.2. Samples: 5728910. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 14:55:23,120][15372] Avg episode reward: [(0, '26.444')] [2024-08-05 14:55:23,294][15444] Updated weights for policy 0, policy_version 2801 (0.0022) [2024-08-05 14:55:27,078][15444] Updated weights for policy 0, policy_version 2811 (0.0026) [2024-08-05 14:55:28,119][15372] Fps is (10 sec: 23757.7, 60 sec: 23210.6, 300 sec: 23409.7). Total num frames: 23052288. Throughput: 0: 5813.3. Samples: 5763250. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 14:55:28,119][15372] Avg episode reward: [(0, '26.328')] [2024-08-05 14:55:30,313][15444] Updated weights for policy 0, policy_version 2821 (0.0013) [2024-08-05 14:55:33,119][15372] Fps is (10 sec: 23760.2, 60 sec: 23347.2, 300 sec: 23409.7). Total num frames: 23175168. Throughput: 0: 5844.0. Samples: 5798920. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 14:55:33,119][15372] Avg episode reward: [(0, '26.766')] [2024-08-05 14:55:33,119][15417] Saving new best policy, reward=26.766! [2024-08-05 14:55:33,823][15444] Updated weights for policy 0, policy_version 2831 (0.0026) [2024-08-05 14:55:37,331][15444] Updated weights for policy 0, policy_version 2841 (0.0012) [2024-08-05 14:55:38,118][15372] Fps is (10 sec: 23757.3, 60 sec: 23483.8, 300 sec: 23409.7). Total num frames: 23289856. Throughput: 0: 5837.8. Samples: 5816640. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 14:55:38,119][15372] Avg episode reward: [(0, '26.297')] [2024-08-05 14:55:41,024][15444] Updated weights for policy 0, policy_version 2851 (0.0013) [2024-08-05 14:55:43,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23347.2, 300 sec: 23409.7). Total num frames: 23404544. Throughput: 0: 5839.1. Samples: 5851410. 
Policy #0 lag: (min: 0.0, avg: 4.8, max: 9.0) [2024-08-05 14:55:43,119][15372] Avg episode reward: [(0, '26.364')] [2024-08-05 14:55:44,346][15444] Updated weights for policy 0, policy_version 2861 (0.0011) [2024-08-05 14:55:48,118][15372] Fps is (10 sec: 22118.2, 60 sec: 23210.7, 300 sec: 23354.2). Total num frames: 23511040. Throughput: 0: 5836.2. Samples: 5886130. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:55:48,119][15372] Avg episode reward: [(0, '26.144')] [2024-08-05 14:55:48,146][15444] Updated weights for policy 0, policy_version 2871 (0.0026) [2024-08-05 14:55:51,665][15444] Updated weights for policy 0, policy_version 2881 (0.0017) [2024-08-05 14:55:53,119][15372] Fps is (10 sec: 22936.1, 60 sec: 23347.1, 300 sec: 23381.9). Total num frames: 23633920. Throughput: 0: 5827.7. Samples: 5903520. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 14:55:53,127][15372] Avg episode reward: [(0, '25.580')] [2024-08-05 14:55:54,948][15444] Updated weights for policy 0, policy_version 2891 (0.0018) [2024-08-05 14:55:58,120][15372] Fps is (10 sec: 23753.1, 60 sec: 23346.6, 300 sec: 23354.0). Total num frames: 23748608. Throughput: 0: 5836.9. Samples: 5938690. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 14:55:58,120][15372] Avg episode reward: [(0, '26.062')] [2024-08-05 14:55:58,644][15444] Updated weights for policy 0, policy_version 2901 (0.0012) [2024-08-05 14:56:01,923][15444] Updated weights for policy 0, policy_version 2911 (0.0022) [2024-08-05 14:56:03,119][15372] Fps is (10 sec: 22938.4, 60 sec: 23347.7, 300 sec: 23326.4). Total num frames: 23863296. Throughput: 0: 5830.5. Samples: 5973250. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 14:56:03,119][15372] Avg episode reward: [(0, '26.269')] [2024-08-05 14:56:05,724][15444] Updated weights for policy 0, policy_version 2921 (0.0016) [2024-08-05 14:56:08,119][15372] Fps is (10 sec: 23759.9, 60 sec: 23347.1, 300 sec: 23354.1). Total num frames: 23986176. 
Throughput: 0: 5828.8. Samples: 5991200. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 14:56:08,119][15372] Avg episode reward: [(0, '25.826')] [2024-08-05 14:56:08,946][15444] Updated weights for policy 0, policy_version 2931 (0.0019) [2024-08-05 14:56:12,667][15444] Updated weights for policy 0, policy_version 2941 (0.0012) [2024-08-05 14:56:13,119][15372] Fps is (10 sec: 23757.3, 60 sec: 23347.2, 300 sec: 23381.9). Total num frames: 24100864. Throughput: 0: 5842.7. Samples: 6026170. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 14:56:13,119][15372] Avg episode reward: [(0, '26.057')] [2024-08-05 14:56:16,030][15444] Updated weights for policy 0, policy_version 2951 (0.0017) [2024-08-05 14:56:17,661][15417] Signal inference workers to stop experience collection... (1000 times) [2024-08-05 14:56:17,662][15417] Signal inference workers to resume experience collection... (1000 times) [2024-08-05 14:56:17,700][15444] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-08-05 14:56:17,701][15444] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-08-05 14:56:18,119][15372] Fps is (10 sec: 23757.2, 60 sec: 23483.9, 300 sec: 23354.1). Total num frames: 24223744. Throughput: 0: 5816.9. Samples: 6060680. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 14:56:18,119][15372] Avg episode reward: [(0, '26.103')] [2024-08-05 14:56:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000002957_24223744.pth... [2024-08-05 14:56:18,237][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000002271_18604032.pth [2024-08-05 14:56:19,688][15444] Updated weights for policy 0, policy_version 2961 (0.0014) [2024-08-05 14:56:23,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23211.2, 300 sec: 23354.1). Total num frames: 24330240. Throughput: 0: 5810.4. 
Samples: 6078110. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 14:56:23,119][15372] Avg episode reward: [(0, '26.444')] [2024-08-05 14:56:23,217][15444] Updated weights for policy 0, policy_version 2971 (0.0017) [2024-08-05 14:56:26,678][15444] Updated weights for policy 0, policy_version 2981 (0.0022) [2024-08-05 14:56:28,119][15372] Fps is (10 sec: 21299.0, 60 sec: 23074.1, 300 sec: 23354.1). Total num frames: 24436736. Throughput: 0: 5808.2. Samples: 6112780. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 14:56:28,119][15372] Avg episode reward: [(0, '26.389')] [2024-08-05 14:56:30,341][15444] Updated weights for policy 0, policy_version 2991 (0.0012) [2024-08-05 14:56:33,119][15372] Fps is (10 sec: 23754.8, 60 sec: 23210.4, 300 sec: 23409.6). Total num frames: 24567808. Throughput: 0: 5803.0. Samples: 6147270. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 14:56:33,120][15372] Avg episode reward: [(0, '26.587')] [2024-08-05 14:56:33,860][15444] Updated weights for policy 0, policy_version 3001 (0.0014) [2024-08-05 14:56:37,600][15444] Updated weights for policy 0, policy_version 3011 (0.0027) [2024-08-05 14:56:38,119][15372] Fps is (10 sec: 24576.3, 60 sec: 23210.6, 300 sec: 23409.7). Total num frames: 24682496. Throughput: 0: 5809.6. Samples: 6164950. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 14:56:38,119][15372] Avg episode reward: [(0, '27.096')] [2024-08-05 14:56:38,122][15417] Saving new best policy, reward=27.096! [2024-08-05 14:56:40,828][15444] Updated weights for policy 0, policy_version 3021 (0.0017) [2024-08-05 14:56:43,118][15372] Fps is (10 sec: 22939.5, 60 sec: 23210.7, 300 sec: 23409.7). Total num frames: 24797184. Throughput: 0: 5786.0. Samples: 6199050. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 14:56:43,119][15372] Avg episode reward: [(0, '26.658')] [2024-08-05 14:56:44,667][15444] Updated weights for policy 0, policy_version 3031 (0.0017) [2024-08-05 14:56:48,002][15444] Updated weights for policy 0, policy_version 3041 (0.0027) [2024-08-05 14:56:48,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23347.2, 300 sec: 23409.7). Total num frames: 24911872. Throughput: 0: 5780.7. Samples: 6233380. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 14:56:48,119][15372] Avg episode reward: [(0, '26.470')] [2024-08-05 14:56:51,860][15444] Updated weights for policy 0, policy_version 3051 (0.0023) [2024-08-05 14:56:53,119][15372] Fps is (10 sec: 22937.2, 60 sec: 23210.9, 300 sec: 23409.7). Total num frames: 25026560. Throughput: 0: 5770.0. Samples: 6250850. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 14:56:53,119][15372] Avg episode reward: [(0, '26.732')] [2024-08-05 14:56:54,882][15444] Updated weights for policy 0, policy_version 3061 (0.0016) [2024-08-05 14:56:58,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23211.3, 300 sec: 23381.9). Total num frames: 25141248. Throughput: 0: 5788.2. Samples: 6286640. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 14:56:58,119][15372] Avg episode reward: [(0, '26.226')] [2024-08-05 14:56:58,522][15444] Updated weights for policy 0, policy_version 3071 (0.0012) [2024-08-05 14:57:02,028][15444] Updated weights for policy 0, policy_version 3081 (0.0020) [2024-08-05 14:57:03,119][15372] Fps is (10 sec: 22937.9, 60 sec: 23210.8, 300 sec: 23354.1). Total num frames: 25255936. Throughput: 0: 5789.8. Samples: 6321220. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:57:03,119][15372] Avg episode reward: [(0, '25.944')] [2024-08-05 14:57:05,542][15444] Updated weights for policy 0, policy_version 3091 (0.0019) [2024-08-05 14:57:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23210.8, 300 sec: 23381.9). Total num frames: 25378816. 
Throughput: 0: 5807.3. Samples: 6339440. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 14:57:08,119][15372] Avg episode reward: [(0, '26.462')] [2024-08-05 14:57:09,127][15444] Updated weights for policy 0, policy_version 3101 (0.0029) [2024-08-05 14:57:12,495][15444] Updated weights for policy 0, policy_version 3111 (0.0020) [2024-08-05 14:57:13,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23210.7, 300 sec: 23354.1). Total num frames: 25493504. Throughput: 0: 5809.3. Samples: 6374200. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 14:57:13,119][15372] Avg episode reward: [(0, '26.188')] [2024-08-05 14:57:16,294][15444] Updated weights for policy 0, policy_version 3121 (0.0015) [2024-08-05 14:57:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23210.7, 300 sec: 23381.9). Total num frames: 25616384. Throughput: 0: 5812.1. Samples: 6408810. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:57:18,119][15372] Avg episode reward: [(0, '26.211')] [2024-08-05 14:57:19,485][15444] Updated weights for policy 0, policy_version 3131 (0.0014) [2024-08-05 14:57:23,118][15372] Fps is (10 sec: 22937.8, 60 sec: 23210.6, 300 sec: 23326.4). Total num frames: 25722880. Throughput: 0: 5831.8. Samples: 6427380. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 14:57:23,119][15372] Avg episode reward: [(0, '25.930')] [2024-08-05 14:57:23,227][15444] Updated weights for policy 0, policy_version 3141 (0.0030) [2024-08-05 14:57:26,235][15417] Signal inference workers to stop experience collection... (1050 times) [2024-08-05 14:57:26,235][15417] Signal inference workers to resume experience collection... 
(1050 times) [2024-08-05 14:57:26,306][15444] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-08-05 14:57:26,306][15444] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-08-05 14:57:26,319][15444] Updated weights for policy 0, policy_version 3151 (0.0041) [2024-08-05 14:57:28,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23483.8, 300 sec: 23381.9). Total num frames: 25845760. Throughput: 0: 5830.9. Samples: 6461440. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 14:57:28,119][15372] Avg episode reward: [(0, '25.807')] [2024-08-05 14:57:30,103][15444] Updated weights for policy 0, policy_version 3161 (0.0027) [2024-08-05 14:57:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23211.0, 300 sec: 23326.4). Total num frames: 25960448. Throughput: 0: 5853.4. Samples: 6496780. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 14:57:33,126][15372] Avg episode reward: [(0, '25.261')] [2024-08-05 14:57:33,545][15444] Updated weights for policy 0, policy_version 3171 (0.0019) [2024-08-05 14:57:37,358][15444] Updated weights for policy 0, policy_version 3181 (0.0017) [2024-08-05 14:57:38,119][15372] Fps is (10 sec: 22936.7, 60 sec: 23210.5, 300 sec: 23326.3). Total num frames: 26075136. Throughput: 0: 5846.2. Samples: 6513930. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 14:57:38,119][15372] Avg episode reward: [(0, '25.834')] [2024-08-05 14:57:40,807][15444] Updated weights for policy 0, policy_version 3191 (0.0010) [2024-08-05 14:57:43,120][15372] Fps is (10 sec: 22934.2, 60 sec: 23210.1, 300 sec: 23326.3). Total num frames: 26189824. Throughput: 0: 5823.4. Samples: 6548700. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 14:57:43,120][15372] Avg episode reward: [(0, '25.744')] [2024-08-05 14:57:44,221][15444] Updated weights for policy 0, policy_version 3201 (0.0011) [2024-08-05 14:57:47,875][15444] Updated weights for policy 0, policy_version 3211 (0.0015) [2024-08-05 14:57:48,119][15372] Fps is (10 sec: 23756.4, 60 sec: 23347.1, 300 sec: 23354.1). Total num frames: 26312704. Throughput: 0: 5829.0. Samples: 6583530. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 14:57:48,119][15372] Avg episode reward: [(0, '27.241')] [2024-08-05 14:57:48,125][15417] Saving new best policy, reward=27.241! [2024-08-05 14:57:51,498][15444] Updated weights for policy 0, policy_version 3221 (0.0021) [2024-08-05 14:57:53,119][15372] Fps is (10 sec: 22939.6, 60 sec: 23210.5, 300 sec: 23298.6). Total num frames: 26419200. Throughput: 0: 5812.4. Samples: 6601000. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 14:57:53,120][15372] Avg episode reward: [(0, '27.493')] [2024-08-05 14:57:53,122][15417] Saving new best policy, reward=27.493! [2024-08-05 14:57:54,787][15444] Updated weights for policy 0, policy_version 3231 (0.0033) [2024-08-05 14:57:58,118][15372] Fps is (10 sec: 22938.8, 60 sec: 23347.2, 300 sec: 23298.6). Total num frames: 26542080. Throughput: 0: 5811.3. Samples: 6635710. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 14:57:58,119][15372] Avg episode reward: [(0, '27.376')] [2024-08-05 14:57:58,392][15444] Updated weights for policy 0, policy_version 3241 (0.0019) [2024-08-05 14:58:01,790][15444] Updated weights for policy 0, policy_version 3251 (0.0022) [2024-08-05 14:58:03,118][15372] Fps is (10 sec: 23758.3, 60 sec: 23347.2, 300 sec: 23326.4). Total num frames: 26656768. Throughput: 0: 5811.8. Samples: 6670340. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 14:58:03,119][15372] Avg episode reward: [(0, '27.017')] [2024-08-05 14:58:05,354][15444] Updated weights for policy 0, policy_version 3261 (0.0038) [2024-08-05 14:58:08,119][15372] Fps is (10 sec: 22936.9, 60 sec: 23210.5, 300 sec: 23326.4). Total num frames: 26771456. Throughput: 0: 5796.8. Samples: 6688240. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 14:58:08,119][15372] Avg episode reward: [(0, '25.697')] [2024-08-05 14:58:08,882][15444] Updated weights for policy 0, policy_version 3271 (0.0016) [2024-08-05 14:58:12,605][15444] Updated weights for policy 0, policy_version 3281 (0.0020) [2024-08-05 14:58:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23347.2, 300 sec: 23326.4). Total num frames: 26894336. Throughput: 0: 5809.8. Samples: 6722880. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 14:58:13,119][15372] Avg episode reward: [(0, '26.571')] [2024-08-05 14:58:15,895][15444] Updated weights for policy 0, policy_version 3291 (0.0011) [2024-08-05 14:58:18,119][15372] Fps is (10 sec: 23757.3, 60 sec: 23210.6, 300 sec: 23298.6). Total num frames: 27009024. Throughput: 0: 5798.2. Samples: 6757700. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 14:58:18,119][15372] Avg episode reward: [(0, '27.299')] [2024-08-05 14:58:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000003297_27009024.pth... [2024-08-05 14:58:18,260][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000002615_21422080.pth [2024-08-05 14:58:19,516][15444] Updated weights for policy 0, policy_version 3301 (0.0027) [2024-08-05 14:58:23,052][15444] Updated weights for policy 0, policy_version 3311 (0.0023) [2024-08-05 14:58:23,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23347.2, 300 sec: 23298.6). Total num frames: 27123712. 
Throughput: 0: 5811.4. Samples: 6775440. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 14:58:23,119][15372] Avg episode reward: [(0, '26.589')] [2024-08-05 14:58:26,382][15444] Updated weights for policy 0, policy_version 3321 (0.0019) [2024-08-05 14:58:28,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23210.7, 300 sec: 23270.8). Total num frames: 27238400. Throughput: 0: 5809.1. Samples: 6810100. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 14:58:28,119][15372] Avg episode reward: [(0, '26.929')] [2024-08-05 14:58:30,099][15444] Updated weights for policy 0, policy_version 3331 (0.0029) [2024-08-05 14:58:33,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23210.6, 300 sec: 23271.0). Total num frames: 27353088. Throughput: 0: 5816.1. Samples: 6845250. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 14:58:33,119][15372] Avg episode reward: [(0, '27.219')] [2024-08-05 14:58:33,397][15444] Updated weights for policy 0, policy_version 3341 (0.0021) [2024-08-05 14:58:37,271][15444] Updated weights for policy 0, policy_version 3351 (0.0018) [2024-08-05 14:58:37,374][15417] Signal inference workers to stop experience collection... (1100 times) [2024-08-05 14:58:37,380][15417] Signal inference workers to resume experience collection... (1100 times) [2024-08-05 14:58:37,451][15444] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-08-05 14:58:37,457][15444] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-08-05 14:58:38,119][15372] Fps is (10 sec: 23754.9, 60 sec: 23347.0, 300 sec: 23298.6). Total num frames: 27475968. Throughput: 0: 5815.5. Samples: 6862700. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 14:58:38,120][15372] Avg episode reward: [(0, '27.622')] [2024-08-05 14:58:38,124][15417] Saving new best policy, reward=27.622! 
[2024-08-05 14:58:40,590][15444] Updated weights for policy 0, policy_version 3361 (0.0014) [2024-08-05 14:58:43,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23347.7, 300 sec: 23270.8). Total num frames: 27590656. Throughput: 0: 5818.6. Samples: 6897550. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 14:58:43,119][15372] Avg episode reward: [(0, '27.420')] [2024-08-05 14:58:44,174][15444] Updated weights for policy 0, policy_version 3371 (0.0046) [2024-08-05 14:58:47,764][15444] Updated weights for policy 0, policy_version 3381 (0.0030) [2024-08-05 14:58:48,119][15372] Fps is (10 sec: 22119.9, 60 sec: 23074.3, 300 sec: 23243.1). Total num frames: 27697152. Throughput: 0: 5816.6. Samples: 6932090. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 14:58:48,119][15372] Avg episode reward: [(0, '27.434')] [2024-08-05 14:58:51,014][15444] Updated weights for policy 0, policy_version 3391 (0.0015) [2024-08-05 14:58:53,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23347.4, 300 sec: 23243.1). Total num frames: 27820032. Throughput: 0: 5817.4. Samples: 6950020. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 14:58:53,119][15372] Avg episode reward: [(0, '27.908')] [2024-08-05 14:58:53,201][15417] Saving new best policy, reward=27.908! [2024-08-05 14:58:54,929][15444] Updated weights for policy 0, policy_version 3401 (0.0017) [2024-08-05 14:58:58,118][15372] Fps is (10 sec: 23757.1, 60 sec: 23210.7, 300 sec: 23270.9). Total num frames: 27934720. Throughput: 0: 5803.6. Samples: 6984040. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 14:58:58,126][15372] Avg episode reward: [(0, '27.461')] [2024-08-05 14:58:58,618][15444] Updated weights for policy 0, policy_version 3411 (0.0028) [2024-08-05 14:59:02,126][15444] Updated weights for policy 0, policy_version 3421 (0.0035) [2024-08-05 14:59:03,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23210.7, 300 sec: 23243.1). Total num frames: 28049408. Throughput: 0: 5788.7. Samples: 7018190. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 14:59:03,119][15372] Avg episode reward: [(0, '27.076')] [2024-08-05 14:59:05,433][15444] Updated weights for policy 0, policy_version 3431 (0.0021) [2024-08-05 14:59:08,119][15372] Fps is (10 sec: 22935.5, 60 sec: 23210.4, 300 sec: 23215.2). Total num frames: 28164096. Throughput: 0: 5788.6. Samples: 7035930. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 14:59:08,120][15372] Avg episode reward: [(0, '27.354')] [2024-08-05 14:59:08,969][15444] Updated weights for policy 0, policy_version 3441 (0.0013) [2024-08-05 14:59:12,494][15444] Updated weights for policy 0, policy_version 3451 (0.0017) [2024-08-05 14:59:13,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23074.1, 300 sec: 23243.1). Total num frames: 28278784. Throughput: 0: 5798.4. Samples: 7071030. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 14:59:13,119][15372] Avg episode reward: [(0, '27.727')] [2024-08-05 14:59:15,986][15444] Updated weights for policy 0, policy_version 3461 (0.0011) [2024-08-05 14:59:18,119][15372] Fps is (10 sec: 23758.4, 60 sec: 23210.6, 300 sec: 23270.8). Total num frames: 28401664. Throughput: 0: 5784.4. Samples: 7105550. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 14:59:18,119][15372] Avg episode reward: [(0, '28.030')] [2024-08-05 14:59:18,122][15417] Saving new best policy, reward=28.030! [2024-08-05 14:59:19,827][15444] Updated weights for policy 0, policy_version 3471 (0.0028) [2024-08-05 14:59:23,119][15372] Fps is (10 sec: 23756.2, 60 sec: 23210.6, 300 sec: 23243.0). Total num frames: 28516352. Throughput: 0: 5783.0. Samples: 7122930. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 14:59:23,119][15372] Avg episode reward: [(0, '28.768')] [2024-08-05 14:59:23,120][15417] Saving new best policy, reward=28.768! 
[2024-08-05 14:59:23,143][15444] Updated weights for policy 0, policy_version 3481 (0.0021) [2024-08-05 14:59:26,743][15444] Updated weights for policy 0, policy_version 3491 (0.0034) [2024-08-05 14:59:28,118][15372] Fps is (10 sec: 22938.1, 60 sec: 23210.7, 300 sec: 23243.1). Total num frames: 28631040. Throughput: 0: 5774.9. Samples: 7157420. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 14:59:28,119][15372] Avg episode reward: [(0, '28.214')] [2024-08-05 14:59:30,120][15444] Updated weights for policy 0, policy_version 3501 (0.0014) [2024-08-05 14:59:31,117][15417] Signal inference workers to stop experience collection... (1150 times) [2024-08-05 14:59:31,117][15417] Signal inference workers to resume experience collection... (1150 times) [2024-08-05 14:59:31,155][15444] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-08-05 14:59:31,155][15444] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-08-05 14:59:33,119][15372] Fps is (10 sec: 22937.9, 60 sec: 23210.7, 300 sec: 23270.8). Total num frames: 28745728. Throughput: 0: 5788.2. Samples: 7192560. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 14:59:33,119][15372] Avg episode reward: [(0, '26.891')] [2024-08-05 14:59:33,665][15444] Updated weights for policy 0, policy_version 3511 (0.0013) [2024-08-05 14:59:37,250][15444] Updated weights for policy 0, policy_version 3521 (0.0020) [2024-08-05 14:59:38,119][15372] Fps is (10 sec: 22937.5, 60 sec: 23074.4, 300 sec: 23243.1). Total num frames: 28860416. Throughput: 0: 5788.2. Samples: 7210490. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 14:59:38,119][15372] Avg episode reward: [(0, '27.250')] [2024-08-05 14:59:40,714][15444] Updated weights for policy 0, policy_version 3531 (0.0012) [2024-08-05 14:59:43,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23210.7, 300 sec: 23270.8). Total num frames: 28983296. Throughput: 0: 5820.2. Samples: 7245950. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 14:59:43,119][15372] Avg episode reward: [(0, '28.026')] [2024-08-05 14:59:44,068][15444] Updated weights for policy 0, policy_version 3541 (0.0014) [2024-08-05 14:59:47,757][15444] Updated weights for policy 0, policy_version 3551 (0.0012) [2024-08-05 14:59:48,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23210.7, 300 sec: 23243.1). Total num frames: 29089792. Throughput: 0: 5822.2. Samples: 7280190. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 14:59:48,119][15372] Avg episode reward: [(0, '27.724')] [2024-08-05 14:59:51,279][15444] Updated weights for policy 0, policy_version 3561 (0.0013) [2024-08-05 14:59:53,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23210.7, 300 sec: 23270.8). Total num frames: 29212672. Throughput: 0: 5823.9. Samples: 7298000. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 14:59:53,126][15372] Avg episode reward: [(0, '27.608')] [2024-08-05 14:59:54,618][15444] Updated weights for policy 0, policy_version 3571 (0.0013) [2024-08-05 14:59:58,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23210.6, 300 sec: 23271.0). Total num frames: 29327360. Throughput: 0: 5828.4. Samples: 7333310. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 14:59:58,119][15372] Avg episode reward: [(0, '25.823')] [2024-08-05 14:59:58,312][15444] Updated weights for policy 0, policy_version 3581 (0.0018) [2024-08-05 15:00:01,689][15444] Updated weights for policy 0, policy_version 3591 (0.0023) [2024-08-05 15:00:03,119][15372] Fps is (10 sec: 22936.8, 60 sec: 23210.5, 300 sec: 23243.0). Total num frames: 29442048. Throughput: 0: 5832.6. Samples: 7368020. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 15:00:03,119][15372] Avg episode reward: [(0, '26.606')] [2024-08-05 15:00:05,268][15444] Updated weights for policy 0, policy_version 3601 (0.0032) [2024-08-05 15:00:08,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23347.5, 300 sec: 23270.8). Total num frames: 29564928. 
Throughput: 0: 5842.9. Samples: 7385860. Policy #0 lag: (min: 0.0, avg: 3.4, max: 9.0) [2024-08-05 15:00:08,119][15372] Avg episode reward: [(0, '26.970')] [2024-08-05 15:00:08,634][15444] Updated weights for policy 0, policy_version 3611 (0.0011) [2024-08-05 15:00:12,146][15444] Updated weights for policy 0, policy_version 3621 (0.0023) [2024-08-05 15:00:13,118][15372] Fps is (10 sec: 23757.6, 60 sec: 23347.2, 300 sec: 23270.9). Total num frames: 29679616. Throughput: 0: 5848.0. Samples: 7420580. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:00:13,119][15372] Avg episode reward: [(0, '28.322')] [2024-08-05 15:00:15,765][15444] Updated weights for policy 0, policy_version 3631 (0.0017) [2024-08-05 15:00:16,757][15417] Signal inference workers to stop experience collection... (1200 times) [2024-08-05 15:00:16,758][15417] Signal inference workers to resume experience collection... (1200 times) [2024-08-05 15:00:16,790][15444] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-08-05 15:00:16,790][15444] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-08-05 15:00:18,118][15372] Fps is (10 sec: 22938.0, 60 sec: 23210.8, 300 sec: 23243.2). Total num frames: 29794304. Throughput: 0: 5854.5. Samples: 7456010. Policy #0 lag: (min: 1.0, avg: 3.0, max: 7.0) [2024-08-05 15:00:18,119][15372] Avg episode reward: [(0, '28.282')] [2024-08-05 15:00:18,164][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000003638_29802496.pth... 
[2024-08-05 15:00:18,357][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000002957_24223744.pth [2024-08-05 15:00:19,037][15444] Updated weights for policy 0, policy_version 3641 (0.0017) [2024-08-05 15:00:22,843][15444] Updated weights for policy 0, policy_version 3651 (0.0020) [2024-08-05 15:00:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23347.3, 300 sec: 23270.8). Total num frames: 29917184. Throughput: 0: 5838.0. Samples: 7473200. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 15:00:23,119][15372] Avg episode reward: [(0, '27.883')] [2024-08-05 15:00:26,214][15444] Updated weights for policy 0, policy_version 3661 (0.0023) [2024-08-05 15:00:28,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23347.2, 300 sec: 23243.1). Total num frames: 30031872. Throughput: 0: 5827.3. Samples: 7508180. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:00:28,119][15372] Avg episode reward: [(0, '27.527')] [2024-08-05 15:00:29,759][15444] Updated weights for policy 0, policy_version 3671 (0.0047) [2024-08-05 15:00:33,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23347.2, 300 sec: 23243.1). Total num frames: 30146560. Throughput: 0: 5840.2. Samples: 7543000. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:00:33,126][15372] Avg episode reward: [(0, '26.880')] [2024-08-05 15:00:33,454][15444] Updated weights for policy 0, policy_version 3681 (0.0024) [2024-08-05 15:00:36,547][15444] Updated weights for policy 0, policy_version 3691 (0.0023) [2024-08-05 15:00:38,119][15372] Fps is (10 sec: 22937.6, 60 sec: 23347.2, 300 sec: 23243.1). Total num frames: 30261248. Throughput: 0: 5848.7. Samples: 7561190. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:00:38,127][15372] Avg episode reward: [(0, '27.144')] [2024-08-05 15:00:40,365][15444] Updated weights for policy 0, policy_version 3701 (0.0022) [2024-08-05 15:00:43,119][15372] Fps is (10 sec: 23755.4, 60 sec: 23347.0, 300 sec: 23298.6). Total num frames: 30384128. Throughput: 0: 5821.9. Samples: 7595300. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:00:43,127][15372] Avg episode reward: [(0, '27.465')] [2024-08-05 15:00:43,697][15444] Updated weights for policy 0, policy_version 3711 (0.0027) [2024-08-05 15:00:47,466][15444] Updated weights for policy 0, policy_version 3721 (0.0016) [2024-08-05 15:00:48,121][15372] Fps is (10 sec: 23750.2, 60 sec: 23482.6, 300 sec: 23270.7). Total num frames: 30498816. Throughput: 0: 5819.0. Samples: 7629890. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:00:48,122][15372] Avg episode reward: [(0, '28.143')] [2024-08-05 15:00:51,156][15444] Updated weights for policy 0, policy_version 3731 (0.0021) [2024-08-05 15:00:53,118][15372] Fps is (10 sec: 22939.0, 60 sec: 23347.2, 300 sec: 23271.0). Total num frames: 30613504. Throughput: 0: 5818.9. Samples: 7647710. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:00:53,119][15372] Avg episode reward: [(0, '28.879')] [2024-08-05 15:00:53,119][15417] Saving new best policy, reward=28.879! [2024-08-05 15:00:54,476][15444] Updated weights for policy 0, policy_version 3741 (0.0022) [2024-08-05 15:00:58,118][15372] Fps is (10 sec: 22124.8, 60 sec: 23210.7, 300 sec: 23243.1). Total num frames: 30720000. Throughput: 0: 5826.4. Samples: 7682770. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:00:58,119][15372] Avg episode reward: [(0, '28.316')] [2024-08-05 15:00:58,312][15444] Updated weights for policy 0, policy_version 3751 (0.0014) [2024-08-05 15:01:01,608][15444] Updated weights for policy 0, policy_version 3761 (0.0020) [2024-08-05 15:01:03,119][15372] Fps is (10 sec: 22936.9, 60 sec: 23347.2, 300 sec: 23243.1). Total num frames: 30842880. Throughput: 0: 5794.4. Samples: 7716760. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 15:01:03,119][15372] Avg episode reward: [(0, '27.670')] [2024-08-05 15:01:03,472][15417] Signal inference workers to stop experience collection... (1250 times) [2024-08-05 15:01:03,473][15417] Signal inference workers to resume experience collection... (1250 times) [2024-08-05 15:01:03,526][15444] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-08-05 15:01:03,526][15444] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-08-05 15:01:05,234][15444] Updated weights for policy 0, policy_version 3771 (0.0025) [2024-08-05 15:01:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 23347.3, 300 sec: 23270.8). Total num frames: 30965760. Throughput: 0: 5810.9. Samples: 7734690. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:01:08,119][15372] Avg episode reward: [(0, '28.306')] [2024-08-05 15:01:08,461][15444] Updated weights for policy 0, policy_version 3781 (0.0026) [2024-08-05 15:01:12,062][15444] Updated weights for policy 0, policy_version 3791 (0.0017) [2024-08-05 15:01:13,118][15372] Fps is (10 sec: 22938.2, 60 sec: 23210.7, 300 sec: 23215.3). Total num frames: 31072256. Throughput: 0: 5807.1. Samples: 7769500. 
Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 15:01:13,119][15372] Avg episode reward: [(0, '28.323')] [2024-08-05 15:01:15,674][15444] Updated weights for policy 0, policy_version 3801 (0.0019) [2024-08-05 15:01:18,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23347.2, 300 sec: 23270.8). Total num frames: 31195136. Throughput: 0: 5838.9. Samples: 7805750. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 15:01:18,119][15372] Avg episode reward: [(0, '28.273')] [2024-08-05 15:01:19,073][15444] Updated weights for policy 0, policy_version 3811 (0.0015) [2024-08-05 15:01:22,378][15444] Updated weights for policy 0, policy_version 3821 (0.0019) [2024-08-05 15:01:23,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23210.7, 300 sec: 23298.6). Total num frames: 31309824. Throughput: 0: 5817.4. Samples: 7822970. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:01:23,119][15372] Avg episode reward: [(0, '28.219')] [2024-08-05 15:01:25,906][15444] Updated weights for policy 0, policy_version 3831 (0.0031) [2024-08-05 15:01:28,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23347.2, 300 sec: 23270.9). Total num frames: 31432704. Throughput: 0: 5827.0. Samples: 7857510. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 15:01:28,120][15372] Avg episode reward: [(0, '27.814')] [2024-08-05 15:01:29,512][15444] Updated weights for policy 0, policy_version 3841 (0.0022) [2024-08-05 15:01:33,119][15372] Fps is (10 sec: 22936.6, 60 sec: 23210.5, 300 sec: 23243.0). Total num frames: 31539200. Throughput: 0: 5824.1. Samples: 7891960. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:01:33,120][15372] Avg episode reward: [(0, '28.105')] [2024-08-05 15:01:33,203][15444] Updated weights for policy 0, policy_version 3851 (0.0025) [2024-08-05 15:01:36,488][15444] Updated weights for policy 0, policy_version 3861 (0.0027) [2024-08-05 15:01:38,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23347.2, 300 sec: 23270.8). Total num frames: 31662080. 
Throughput: 0: 5832.2. Samples: 7910160. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 15:01:38,126][15372] Avg episode reward: [(0, '27.946')] [2024-08-05 15:01:40,219][15444] Updated weights for policy 0, policy_version 3871 (0.0022) [2024-08-05 15:01:43,118][15372] Fps is (10 sec: 23757.8, 60 sec: 23210.9, 300 sec: 23270.8). Total num frames: 31776768. Throughput: 0: 5808.7. Samples: 7944160. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:01:43,127][15372] Avg episode reward: [(0, '27.664')] [2024-08-05 15:01:43,678][15444] Updated weights for policy 0, policy_version 3881 (0.0014) [2024-08-05 15:01:47,265][15444] Updated weights for policy 0, policy_version 3891 (0.0043) [2024-08-05 15:01:48,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23211.7, 300 sec: 23270.8). Total num frames: 31891456. Throughput: 0: 5821.6. Samples: 7978730. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 15:01:48,119][15372] Avg episode reward: [(0, '27.567')] [2024-08-05 15:01:50,917][15444] Updated weights for policy 0, policy_version 3901 (0.0023) [2024-08-05 15:01:53,119][15372] Fps is (10 sec: 22937.2, 60 sec: 23210.6, 300 sec: 23270.8). Total num frames: 32006144. Throughput: 0: 5816.2. Samples: 7996420. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:01:53,119][15372] Avg episode reward: [(0, '28.921')] [2024-08-05 15:01:53,120][15417] Saving new best policy, reward=28.921! [2024-08-05 15:01:54,557][15444] Updated weights for policy 0, policy_version 3911 (0.0012) [2024-08-05 15:01:58,038][15444] Updated weights for policy 0, policy_version 3921 (0.0020) [2024-08-05 15:01:58,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23347.2, 300 sec: 23270.8). Total num frames: 32120832. Throughput: 0: 5816.2. Samples: 8031230. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:01:58,119][15372] Avg episode reward: [(0, '27.827')] [2024-08-05 15:02:01,187][15444] Updated weights for policy 0, policy_version 3931 (0.0029) [2024-08-05 15:02:03,118][15372] Fps is (10 sec: 22938.0, 60 sec: 23210.8, 300 sec: 23243.1). Total num frames: 32235520. Throughput: 0: 5774.2. Samples: 8065590. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:02:03,119][15372] Avg episode reward: [(0, '27.797')] [2024-08-05 15:02:03,358][15417] Signal inference workers to stop experience collection... (1300 times) [2024-08-05 15:02:03,359][15417] Signal inference workers to resume experience collection... (1300 times) [2024-08-05 15:02:03,436][15444] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-08-05 15:02:03,436][15444] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-08-05 15:02:05,030][15444] Updated weights for policy 0, policy_version 3941 (0.0018) [2024-08-05 15:02:08,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23210.6, 300 sec: 23270.8). Total num frames: 32358400. Throughput: 0: 5780.0. Samples: 8083070. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 15:02:08,135][15372] Avg episode reward: [(0, '28.462')] [2024-08-05 15:02:08,402][15444] Updated weights for policy 0, policy_version 3951 (0.0011) [2024-08-05 15:02:12,047][15444] Updated weights for policy 0, policy_version 3961 (0.0023) [2024-08-05 15:02:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23347.2, 300 sec: 23243.1). Total num frames: 32473088. Throughput: 0: 5785.1. Samples: 8117840. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 15:02:13,119][15372] Avg episode reward: [(0, '28.520')] [2024-08-05 15:02:15,697][15444] Updated weights for policy 0, policy_version 3971 (0.0031) [2024-08-05 15:02:18,119][15372] Fps is (10 sec: 22937.5, 60 sec: 23210.6, 300 sec: 23270.8). Total num frames: 32587776. Throughput: 0: 5801.4. Samples: 8153020. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:02:18,119][15372] Avg episode reward: [(0, '28.240')] [2024-08-05 15:02:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000003978_32587776.pth... [2024-08-05 15:02:18,315][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000003297_27009024.pth [2024-08-05 15:02:19,203][15444] Updated weights for policy 0, policy_version 3981 (0.0020) [2024-08-05 15:02:22,633][15444] Updated weights for policy 0, policy_version 3991 (0.0018) [2024-08-05 15:02:23,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23210.7, 300 sec: 23243.1). Total num frames: 32702464. Throughput: 0: 5773.3. Samples: 8169960. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:02:23,119][15372] Avg episode reward: [(0, '28.172')] [2024-08-05 15:02:26,152][15444] Updated weights for policy 0, policy_version 4001 (0.0023) [2024-08-05 15:02:28,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23074.1, 300 sec: 23243.0). Total num frames: 32817152. Throughput: 0: 5788.2. Samples: 8204630. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:02:28,119][15372] Avg episode reward: [(0, '27.700')] [2024-08-05 15:02:29,554][15444] Updated weights for policy 0, policy_version 4011 (0.0012) [2024-08-05 15:02:33,074][15444] Updated weights for policy 0, policy_version 4021 (0.0027) [2024-08-05 15:02:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23347.4, 300 sec: 23270.9). Total num frames: 32940032. Throughput: 0: 5812.5. Samples: 8240290. Policy #0 lag: (min: 1.0, avg: 3.8, max: 9.0) [2024-08-05 15:02:33,119][15372] Avg episode reward: [(0, '27.748')] [2024-08-05 15:02:36,697][15444] Updated weights for policy 0, policy_version 4031 (0.0030) [2024-08-05 15:02:38,119][15372] Fps is (10 sec: 22936.9, 60 sec: 23073.9, 300 sec: 23243.1). Total num frames: 33046528. 
Throughput: 0: 5814.6. Samples: 8258080. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 15:02:38,119][15372] Avg episode reward: [(0, '27.620')] [2024-08-05 15:02:40,276][15444] Updated weights for policy 0, policy_version 4041 (0.0031) [2024-08-05 15:02:43,119][15372] Fps is (10 sec: 22935.5, 60 sec: 23210.3, 300 sec: 23243.0). Total num frames: 33169408. Throughput: 0: 5804.8. Samples: 8292450. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:02:43,120][15372] Avg episode reward: [(0, '27.896')] [2024-08-05 15:02:43,881][15444] Updated weights for policy 0, policy_version 4051 (0.0012) [2024-08-05 15:02:47,415][15444] Updated weights for policy 0, policy_version 4061 (0.0012) [2024-08-05 15:02:48,118][15372] Fps is (10 sec: 23758.1, 60 sec: 23210.7, 300 sec: 23270.9). Total num frames: 33284096. Throughput: 0: 5811.1. Samples: 8327090. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:02:48,119][15372] Avg episode reward: [(0, '28.604')] [2024-08-05 15:02:50,574][15444] Updated weights for policy 0, policy_version 4071 (0.0017) [2024-08-05 15:02:53,119][15372] Fps is (10 sec: 22939.1, 60 sec: 23210.6, 300 sec: 23243.0). Total num frames: 33398784. Throughput: 0: 5810.4. Samples: 8344540. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 15:02:53,127][15372] Avg episode reward: [(0, '28.112')] [2024-08-05 15:02:54,522][15444] Updated weights for policy 0, policy_version 4081 (0.0011) [2024-08-05 15:02:57,857][15444] Updated weights for policy 0, policy_version 4091 (0.0012) [2024-08-05 15:02:58,119][15372] Fps is (10 sec: 22936.7, 60 sec: 23210.5, 300 sec: 23243.0). Total num frames: 33513472. Throughput: 0: 5819.3. Samples: 8379710. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 7.0) [2024-08-05 15:02:58,119][15372] Avg episode reward: [(0, '27.714')] [2024-08-05 15:03:01,330][15444] Updated weights for policy 0, policy_version 4101 (0.0027) [2024-08-05 15:03:03,118][15372] Fps is (10 sec: 22938.2, 60 sec: 23210.7, 300 sec: 23243.1). Total num frames: 33628160. Throughput: 0: 5803.1. Samples: 8414160. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 15:03:03,119][15372] Avg episode reward: [(0, '28.039')] [2024-08-05 15:03:03,428][15417] Signal inference workers to stop experience collection... (1350 times) [2024-08-05 15:03:03,429][15417] Signal inference workers to resume experience collection... (1350 times) [2024-08-05 15:03:03,469][15444] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-08-05 15:03:03,470][15444] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-08-05 15:03:05,269][15444] Updated weights for policy 0, policy_version 4111 (0.0015) [2024-08-05 15:03:08,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23210.5, 300 sec: 23243.0). Total num frames: 33751040. Throughput: 0: 5823.9. Samples: 8432040. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:03:08,119][15372] Avg episode reward: [(0, '28.388')] [2024-08-05 15:03:08,184][15444] Updated weights for policy 0, policy_version 4121 (0.0020) [2024-08-05 15:03:11,984][15444] Updated weights for policy 0, policy_version 4131 (0.0013) [2024-08-05 15:03:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23210.7, 300 sec: 23243.1). Total num frames: 33865728. Throughput: 0: 5830.7. Samples: 8467010. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 15:03:13,119][15372] Avg episode reward: [(0, '27.980')] [2024-08-05 15:03:15,339][15444] Updated weights for policy 0, policy_version 4141 (0.0012) [2024-08-05 15:03:18,118][15372] Fps is (10 sec: 22938.6, 60 sec: 23210.7, 300 sec: 23243.1). Total num frames: 33980416. Throughput: 0: 5818.4. Samples: 8502120. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:03:18,119][15372] Avg episode reward: [(0, '28.148')] [2024-08-05 15:03:18,970][15444] Updated weights for policy 0, policy_version 4151 (0.0012) [2024-08-05 15:03:22,314][15444] Updated weights for policy 0, policy_version 4161 (0.0020) [2024-08-05 15:03:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23347.2, 300 sec: 23270.8). Total num frames: 34103296. Throughput: 0: 5817.0. Samples: 8519840. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:03:23,119][15372] Avg episode reward: [(0, '28.506')] [2024-08-05 15:03:25,996][15444] Updated weights for policy 0, policy_version 4171 (0.0014) [2024-08-05 15:03:28,119][15372] Fps is (10 sec: 22937.2, 60 sec: 23210.7, 300 sec: 23243.1). Total num frames: 34209792. Throughput: 0: 5823.2. Samples: 8554490. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:03:28,120][15372] Avg episode reward: [(0, '27.927')] [2024-08-05 15:03:29,405][15444] Updated weights for policy 0, policy_version 4181 (0.0024) [2024-08-05 15:03:33,119][15372] Fps is (10 sec: 22117.8, 60 sec: 23074.1, 300 sec: 23215.3). Total num frames: 34324480. Throughput: 0: 5809.3. Samples: 8588510. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:03:33,126][15372] Avg episode reward: [(0, '26.952')] [2024-08-05 15:03:33,265][15444] Updated weights for policy 0, policy_version 4191 (0.0022) [2024-08-05 15:03:36,390][15444] Updated weights for policy 0, policy_version 4201 (0.0017) [2024-08-05 15:03:38,119][15372] Fps is (10 sec: 22936.6, 60 sec: 23210.6, 300 sec: 23215.3). Total num frames: 34439168. Throughput: 0: 5828.0. Samples: 8606800. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:03:38,120][15372] Avg episode reward: [(0, '27.724')] [2024-08-05 15:03:40,412][15444] Updated weights for policy 0, policy_version 4211 (0.0013) [2024-08-05 15:03:43,119][15372] Fps is (10 sec: 22118.7, 60 sec: 22937.9, 300 sec: 23215.3). Total num frames: 34545664. 
Throughput: 0: 5768.3. Samples: 8639280. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 15:03:43,119][15372] Avg episode reward: [(0, '28.776')] [2024-08-05 15:03:44,525][15444] Updated weights for policy 0, policy_version 4221 (0.0027) [2024-08-05 15:03:48,119][15372] Fps is (10 sec: 20481.1, 60 sec: 22664.5, 300 sec: 23132.0). Total num frames: 34643968. Throughput: 0: 5653.3. Samples: 8668560. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:03:48,119][15372] Avg episode reward: [(0, '28.325')] [2024-08-05 15:03:48,569][15444] Updated weights for policy 0, policy_version 4231 (0.0025) [2024-08-05 15:03:52,575][15444] Updated weights for policy 0, policy_version 4241 (0.0012) [2024-08-05 15:03:53,118][15372] Fps is (10 sec: 21299.3, 60 sec: 22664.6, 300 sec: 23132.0). Total num frames: 34758656. Throughput: 0: 5616.0. Samples: 8684760. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 15:03:53,119][15372] Avg episode reward: [(0, '28.612')] [2024-08-05 15:03:53,184][15417] Signal inference workers to stop experience collection... (1400 times) [2024-08-05 15:03:53,184][15417] Signal inference workers to resume experience collection... (1400 times) [2024-08-05 15:03:53,213][15444] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-08-05 15:03:53,213][15444] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-08-05 15:03:55,549][15444] Updated weights for policy 0, policy_version 4251 (0.0028) [2024-08-05 15:03:58,119][15372] Fps is (10 sec: 23756.7, 60 sec: 22801.2, 300 sec: 23159.7). Total num frames: 34881536. Throughput: 0: 5624.6. Samples: 8720120. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 15:03:58,119][15372] Avg episode reward: [(0, '29.003')] [2024-08-05 15:03:58,122][15417] Saving new best policy, reward=29.003! 
[2024-08-05 15:03:59,258][15444] Updated weights for policy 0, policy_version 4261 (0.0024) [2024-08-05 15:04:02,474][15444] Updated weights for policy 0, policy_version 4271 (0.0019) [2024-08-05 15:04:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 22801.1, 300 sec: 23159.8). Total num frames: 34996224. Throughput: 0: 5638.0. Samples: 8755830. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:04:03,119][15372] Avg episode reward: [(0, '28.994')] [2024-08-05 15:04:05,909][15444] Updated weights for policy 0, policy_version 4281 (0.0011) [2024-08-05 15:04:08,119][15372] Fps is (10 sec: 23755.6, 60 sec: 22801.0, 300 sec: 23187.5). Total num frames: 35119104. Throughput: 0: 5656.4. Samples: 8774380. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:04:08,127][15372] Avg episode reward: [(0, '28.826')] [2024-08-05 15:04:09,476][15444] Updated weights for policy 0, policy_version 4291 (0.0012) [2024-08-05 15:04:12,572][15444] Updated weights for policy 0, policy_version 4301 (0.0013) [2024-08-05 15:04:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 22937.6, 300 sec: 23187.5). Total num frames: 35241984. Throughput: 0: 5692.9. Samples: 8810670. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:04:13,119][15372] Avg episode reward: [(0, '28.203')] [2024-08-05 15:04:16,021][15444] Updated weights for policy 0, policy_version 4311 (0.0014) [2024-08-05 15:04:18,119][15372] Fps is (10 sec: 24576.4, 60 sec: 23073.9, 300 sec: 23215.3). Total num frames: 35364864. Throughput: 0: 5736.4. Samples: 8846650. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 15:04:18,120][15372] Avg episode reward: [(0, '28.018')] [2024-08-05 15:04:18,124][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000004317_35364864.pth... 
[2024-08-05 15:04:18,270][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000003638_29802496.pth [2024-08-05 15:04:19,687][15444] Updated weights for policy 0, policy_version 4321 (0.0014) [2024-08-05 15:04:22,844][15444] Updated weights for policy 0, policy_version 4331 (0.0035) [2024-08-05 15:04:23,119][15372] Fps is (10 sec: 23756.7, 60 sec: 22937.6, 300 sec: 23215.3). Total num frames: 35479552. Throughput: 0: 5727.4. Samples: 8864530. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:04:23,119][15372] Avg episode reward: [(0, '28.469')] [2024-08-05 15:04:26,296][15444] Updated weights for policy 0, policy_version 4341 (0.0016) [2024-08-05 15:04:28,119][15372] Fps is (10 sec: 22938.3, 60 sec: 23074.1, 300 sec: 23215.3). Total num frames: 35594240. Throughput: 0: 5798.0. Samples: 8900190. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:04:28,126][15372] Avg episode reward: [(0, '28.773')] [2024-08-05 15:04:29,495][15444] Updated weights for policy 0, policy_version 4351 (0.0018) [2024-08-05 15:04:32,998][15444] Updated weights for policy 0, policy_version 4361 (0.0033) [2024-08-05 15:04:33,118][15372] Fps is (10 sec: 24576.3, 60 sec: 23347.3, 300 sec: 23270.8). Total num frames: 35725312. Throughput: 0: 5967.3. Samples: 8937090. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 15:04:33,119][15372] Avg episode reward: [(0, '28.247')] [2024-08-05 15:04:36,703][15444] Updated weights for policy 0, policy_version 4371 (0.0027) [2024-08-05 15:04:38,118][15372] Fps is (10 sec: 24576.5, 60 sec: 23347.5, 300 sec: 23243.1). Total num frames: 35840000. Throughput: 0: 6015.6. Samples: 8955460. 
Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 15:04:38,119][15372] Avg episode reward: [(0, '28.213')] [2024-08-05 15:04:39,640][15444] Updated weights for policy 0, policy_version 4381 (0.0024) [2024-08-05 15:04:43,046][15417] Signal inference workers to stop experience collection... (1450 times) [2024-08-05 15:04:43,055][15417] Signal inference workers to resume experience collection... (1450 times) [2024-08-05 15:04:43,088][15444] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-08-05 15:04:43,095][15444] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-08-05 15:04:43,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23483.8, 300 sec: 23270.8). Total num frames: 35954688. Throughput: 0: 6034.9. Samples: 8991690. Policy #0 lag: (min: 2.0, avg: 4.5, max: 8.0) [2024-08-05 15:04:43,119][15372] Avg episode reward: [(0, '28.125')] [2024-08-05 15:04:43,393][15444] Updated weights for policy 0, policy_version 4391 (0.0012) [2024-08-05 15:04:46,522][15444] Updated weights for policy 0, policy_version 4401 (0.0012) [2024-08-05 15:04:48,120][15372] Fps is (10 sec: 24573.2, 60 sec: 24029.5, 300 sec: 23298.5). Total num frames: 36085760. Throughput: 0: 6031.6. Samples: 9027260. Policy #0 lag: (min: 0.0, avg: 4.8, max: 9.0) [2024-08-05 15:04:48,120][15372] Avg episode reward: [(0, '27.351')] [2024-08-05 15:04:49,922][15444] Updated weights for policy 0, policy_version 4411 (0.0012) [2024-08-05 15:04:53,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24166.4, 300 sec: 23326.4). Total num frames: 36208640. Throughput: 0: 6036.5. Samples: 9046020. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:04:53,119][15372] Avg episode reward: [(0, '28.174')] [2024-08-05 15:04:53,457][15444] Updated weights for policy 0, policy_version 4421 (0.0011) [2024-08-05 15:04:56,537][15444] Updated weights for policy 0, policy_version 4431 (0.0016) [2024-08-05 15:04:58,118][15372] Fps is (10 sec: 24578.8, 60 sec: 24166.5, 300 sec: 23354.2). Total num frames: 36331520. Throughput: 0: 6034.9. Samples: 9082240. Policy #0 lag: (min: 1.0, avg: 3.7, max: 9.0) [2024-08-05 15:04:58,126][15372] Avg episode reward: [(0, '28.828')] [2024-08-05 15:05:00,142][15444] Updated weights for policy 0, policy_version 4441 (0.0014) [2024-08-05 15:05:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 23354.2). Total num frames: 36454400. Throughput: 0: 6050.3. Samples: 9118910. Policy #0 lag: (min: 1.0, avg: 4.5, max: 9.0) [2024-08-05 15:05:03,119][15372] Avg episode reward: [(0, '27.752')] [2024-08-05 15:05:03,361][15444] Updated weights for policy 0, policy_version 4451 (0.0018) [2024-08-05 15:05:06,825][15444] Updated weights for policy 0, policy_version 4461 (0.0012) [2024-08-05 15:05:08,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24166.5, 300 sec: 23354.1). Total num frames: 36569088. Throughput: 0: 6061.5. Samples: 9137300. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:05:08,119][15372] Avg episode reward: [(0, '27.621')] [2024-08-05 15:05:10,127][15444] Updated weights for policy 0, policy_version 4471 (0.0010) [2024-08-05 15:05:13,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.4, 300 sec: 23381.9). Total num frames: 36691968. Throughput: 0: 6090.7. Samples: 9174270. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 15:05:13,126][15372] Avg episode reward: [(0, '28.920')] [2024-08-05 15:05:13,479][15444] Updated weights for policy 0, policy_version 4481 (0.0028) [2024-08-05 15:05:16,893][15444] Updated weights for policy 0, policy_version 4491 (0.0012) [2024-08-05 15:05:18,118][15372] Fps is (10 sec: 24577.2, 60 sec: 24166.6, 300 sec: 23381.9). Total num frames: 36814848. Throughput: 0: 6062.7. Samples: 9209910. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:05:18,119][15372] Avg episode reward: [(0, '28.882')] [2024-08-05 15:05:20,407][15444] Updated weights for policy 0, policy_version 4501 (0.0012) [2024-08-05 15:05:23,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.8, 300 sec: 23409.7). Total num frames: 36937728. Throughput: 0: 6066.2. Samples: 9228440. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 15:05:23,119][15372] Avg episode reward: [(0, '28.688')] [2024-08-05 15:05:23,599][15444] Updated weights for policy 0, policy_version 4511 (0.0015) [2024-08-05 15:05:27,230][15444] Updated weights for policy 0, policy_version 4521 (0.0016) [2024-08-05 15:05:27,659][15417] Signal inference workers to stop experience collection... (1500 times) [2024-08-05 15:05:27,660][15417] Signal inference workers to resume experience collection... (1500 times) [2024-08-05 15:05:27,715][15444] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-08-05 15:05:27,722][15444] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-08-05 15:05:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 23437.5). Total num frames: 37060608. Throughput: 0: 6060.7. Samples: 9264420. 
Policy #0 lag: (min: 2.0, avg: 4.8, max: 9.0) [2024-08-05 15:05:28,119][15372] Avg episode reward: [(0, '28.199')] [2024-08-05 15:05:30,263][15444] Updated weights for policy 0, policy_version 4531 (0.0010) [2024-08-05 15:05:33,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24166.4, 300 sec: 23437.5). Total num frames: 37175296. Throughput: 0: 6097.3. Samples: 9301630. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:05:33,119][15372] Avg episode reward: [(0, '27.896')] [2024-08-05 15:05:33,775][15444] Updated weights for policy 0, policy_version 4541 (0.0014) [2024-08-05 15:05:37,273][15444] Updated weights for policy 0, policy_version 4551 (0.0025) [2024-08-05 15:05:38,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24439.4, 300 sec: 23465.3). Total num frames: 37306368. Throughput: 0: 6081.3. Samples: 9319680. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:05:38,119][15372] Avg episode reward: [(0, '27.550')] [2024-08-05 15:05:40,354][15444] Updated weights for policy 0, policy_version 4561 (0.0016) [2024-08-05 15:05:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 23465.5). Total num frames: 37421056. Throughput: 0: 6086.0. Samples: 9356110. Policy #0 lag: (min: 1.0, avg: 4.6, max: 8.0) [2024-08-05 15:05:43,119][15372] Avg episode reward: [(0, '27.725')] [2024-08-05 15:05:43,965][15444] Updated weights for policy 0, policy_version 4571 (0.0019) [2024-08-05 15:05:47,390][15444] Updated weights for policy 0, policy_version 4581 (0.0023) [2024-08-05 15:05:48,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24303.4, 300 sec: 23493.0). Total num frames: 37543936. Throughput: 0: 6067.3. Samples: 9391940. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:05:48,119][15372] Avg episode reward: [(0, '27.252')] [2024-08-05 15:05:50,686][15444] Updated weights for policy 0, policy_version 4591 (0.0013) [2024-08-05 15:05:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 23520.8). Total num frames: 37658624. 
Throughput: 0: 6059.4. Samples: 9409970. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:05:53,131][15372] Avg episode reward: [(0, '27.298')] [2024-08-05 15:05:54,303][15444] Updated weights for policy 0, policy_version 4601 (0.0018) [2024-08-05 15:05:57,535][15444] Updated weights for policy 0, policy_version 4611 (0.0014) [2024-08-05 15:05:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 23520.8). Total num frames: 37781504. Throughput: 0: 6049.8. Samples: 9446510. Policy #0 lag: (min: 1.0, avg: 4.5, max: 9.0) [2024-08-05 15:05:58,119][15372] Avg episode reward: [(0, '26.914')] [2024-08-05 15:06:00,978][15444] Updated weights for policy 0, policy_version 4621 (0.0020) [2024-08-05 15:06:03,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 23520.8). Total num frames: 37904384. Throughput: 0: 6071.1. Samples: 9483110. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:06:03,119][15372] Avg episode reward: [(0, '27.444')] [2024-08-05 15:06:04,224][15444] Updated weights for policy 0, policy_version 4631 (0.0021) [2024-08-05 15:06:07,656][15444] Updated weights for policy 0, policy_version 4641 (0.0022) [2024-08-05 15:06:08,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.1, 300 sec: 23576.3). Total num frames: 38027264. Throughput: 0: 6040.5. Samples: 9500260. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:06:08,119][15372] Avg episode reward: [(0, '28.114')] [2024-08-05 15:06:11,311][15444] Updated weights for policy 0, policy_version 4651 (0.0012) [2024-08-05 15:06:13,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24166.3, 300 sec: 23548.5). Total num frames: 38141952. Throughput: 0: 6043.3. Samples: 9536370. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:06:13,119][15372] Avg episode reward: [(0, '28.624')] [2024-08-05 15:06:14,321][15444] Updated weights for policy 0, policy_version 4661 (0.0014) [2024-08-05 15:06:18,103][15444] Updated weights for policy 0, policy_version 4671 (0.0019) [2024-08-05 15:06:18,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 23576.3). Total num frames: 38264832. Throughput: 0: 6021.8. Samples: 9572610. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 15:06:18,119][15372] Avg episode reward: [(0, '27.983')] [2024-08-05 15:06:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000004671_38264832.pth... [2024-08-05 15:06:18,262][15417] Signal inference workers to stop experience collection... (1550 times) [2024-08-05 15:06:18,263][15417] Signal inference workers to resume experience collection... (1550 times) [2024-08-05 15:06:18,269][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000003978_32587776.pth [2024-08-05 15:06:18,313][15444] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-08-05 15:06:18,318][15444] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-08-05 15:06:22,238][15444] Updated weights for policy 0, policy_version 4681 (0.0038) [2024-08-05 15:06:23,118][15372] Fps is (10 sec: 22938.6, 60 sec: 23893.5, 300 sec: 23520.8). Total num frames: 38371328. Throughput: 0: 5990.5. Samples: 9589250. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:06:23,119][15372] Avg episode reward: [(0, '28.334')] [2024-08-05 15:06:25,362][15444] Updated weights for policy 0, policy_version 4691 (0.0032) [2024-08-05 15:06:28,118][15372] Fps is (10 sec: 22118.7, 60 sec: 23756.8, 300 sec: 23548.6). Total num frames: 38486016. Throughput: 0: 5910.7. Samples: 9622090. 
Policy #0 lag: (min: 1.0, avg: 2.8, max: 7.0) [2024-08-05 15:06:28,124][15372] Avg episode reward: [(0, '28.620')] [2024-08-05 15:06:29,310][15444] Updated weights for policy 0, policy_version 4701 (0.0011) [2024-08-05 15:06:32,701][15444] Updated weights for policy 0, policy_version 4711 (0.0017) [2024-08-05 15:06:33,119][15372] Fps is (10 sec: 22937.1, 60 sec: 23756.7, 300 sec: 23520.7). Total num frames: 38600704. Throughput: 0: 5866.0. Samples: 9655910. Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 15:06:33,119][15372] Avg episode reward: [(0, '28.886')] [2024-08-05 15:06:36,340][15444] Updated weights for policy 0, policy_version 4721 (0.0021) [2024-08-05 15:06:38,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23483.8, 300 sec: 23520.8). Total num frames: 38715392. Throughput: 0: 5865.6. Samples: 9673920. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 15:06:38,119][15372] Avg episode reward: [(0, '28.357')] [2024-08-05 15:06:39,657][15444] Updated weights for policy 0, policy_version 4731 (0.0014) [2024-08-05 15:06:43,107][15444] Updated weights for policy 0, policy_version 4741 (0.0038) [2024-08-05 15:06:43,119][15372] Fps is (10 sec: 23755.7, 60 sec: 23620.0, 300 sec: 23548.5). Total num frames: 38838272. Throughput: 0: 5838.1. Samples: 9709230. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:06:43,119][15372] Avg episode reward: [(0, '28.408')] [2024-08-05 15:06:46,839][15444] Updated weights for policy 0, policy_version 4751 (0.0021) [2024-08-05 15:06:48,119][15372] Fps is (10 sec: 23756.3, 60 sec: 23483.7, 300 sec: 23548.5). Total num frames: 38952960. Throughput: 0: 5781.1. Samples: 9743260. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:06:48,119][15372] Avg episode reward: [(0, '28.484')] [2024-08-05 15:06:50,034][15444] Updated weights for policy 0, policy_version 4761 (0.0022) [2024-08-05 15:06:53,118][15372] Fps is (10 sec: 22119.8, 60 sec: 23347.2, 300 sec: 23520.8). Total num frames: 39059456. 
Throughput: 0: 5800.2. Samples: 9761270. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 15:06:53,119][15372] Avg episode reward: [(0, '28.535')] [2024-08-05 15:06:53,885][15444] Updated weights for policy 0, policy_version 4771 (0.0013) [2024-08-05 15:06:56,130][15417] Signal inference workers to stop experience collection... (1600 times) [2024-08-05 15:06:56,133][15417] Signal inference workers to resume experience collection... (1600 times) [2024-08-05 15:06:56,229][15444] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-08-05 15:06:56,235][15444] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-08-05 15:06:57,580][15444] Updated weights for policy 0, policy_version 4781 (0.0032) [2024-08-05 15:06:58,119][15372] Fps is (10 sec: 22937.9, 60 sec: 23347.2, 300 sec: 23548.5). Total num frames: 39182336. Throughput: 0: 5762.5. Samples: 9795680. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 15:06:58,119][15372] Avg episode reward: [(0, '28.805')] [2024-08-05 15:07:00,794][15444] Updated weights for policy 0, policy_version 4791 (0.0021) [2024-08-05 15:07:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23210.7, 300 sec: 23520.8). Total num frames: 39297024. Throughput: 0: 5747.1. Samples: 9831230. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:07:03,119][15372] Avg episode reward: [(0, '28.249')] [2024-08-05 15:07:04,530][15444] Updated weights for policy 0, policy_version 4801 (0.0029) [2024-08-05 15:07:07,825][15444] Updated weights for policy 0, policy_version 4811 (0.0013) [2024-08-05 15:07:08,119][15372] Fps is (10 sec: 22937.1, 60 sec: 23074.0, 300 sec: 23520.7). Total num frames: 39411712. Throughput: 0: 5742.0. Samples: 9847640. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:07:08,120][15372] Avg episode reward: [(0, '27.537')] [2024-08-05 15:07:11,374][15444] Updated weights for policy 0, policy_version 4821 (0.0012) [2024-08-05 15:07:13,119][15372] Fps is (10 sec: 22937.1, 60 sec: 23074.2, 300 sec: 23520.8). Total num frames: 39526400. Throughput: 0: 5788.9. Samples: 9882590. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:07:13,126][15372] Avg episode reward: [(0, '28.322')] [2024-08-05 15:07:14,986][15444] Updated weights for policy 0, policy_version 4831 (0.0025) [2024-08-05 15:07:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23074.0, 300 sec: 23548.5). Total num frames: 39649280. Throughput: 0: 5824.9. Samples: 9918030. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:07:18,127][15372] Avg episode reward: [(0, '28.784')] [2024-08-05 15:07:18,487][15444] Updated weights for policy 0, policy_version 4841 (0.0017) [2024-08-05 15:07:21,990][15444] Updated weights for policy 0, policy_version 4851 (0.0024) [2024-08-05 15:07:23,118][15372] Fps is (10 sec: 23757.3, 60 sec: 23210.7, 300 sec: 23548.5). Total num frames: 39763968. Throughput: 0: 5826.0. Samples: 9936090. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:07:23,119][15372] Avg episode reward: [(0, '28.860')] [2024-08-05 15:07:25,477][15444] Updated weights for policy 0, policy_version 4861 (0.0012) [2024-08-05 15:07:28,119][15372] Fps is (10 sec: 22938.3, 60 sec: 23210.6, 300 sec: 23520.8). Total num frames: 39878656. Throughput: 0: 5799.0. Samples: 9970180. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:07:28,132][15372] Avg episode reward: [(0, '27.643')] [2024-08-05 15:07:29,209][15444] Updated weights for policy 0, policy_version 4871 (0.0022) [2024-08-05 15:07:32,549][15444] Updated weights for policy 0, policy_version 4881 (0.0013) [2024-08-05 15:07:33,119][15372] Fps is (10 sec: 22935.5, 60 sec: 23210.4, 300 sec: 23548.5). Total num frames: 39993344. 
Throughput: 0: 5809.9. Samples: 10004710. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:07:33,120][15372] Avg episode reward: [(0, '28.098')] [2024-08-05 15:07:36,020][15444] Updated weights for policy 0, policy_version 4891 (0.0012) [2024-08-05 15:07:38,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23210.6, 300 sec: 23520.8). Total num frames: 40108032. Throughput: 0: 5803.5. Samples: 10022430. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:07:38,119][15372] Avg episode reward: [(0, '28.602')] [2024-08-05 15:07:39,549][15444] Updated weights for policy 0, policy_version 4901 (0.0012) [2024-08-05 15:07:43,119][15372] Fps is (10 sec: 22939.5, 60 sec: 23074.4, 300 sec: 23520.8). Total num frames: 40222720. Throughput: 0: 5806.0. Samples: 10056950. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:07:43,126][15372] Avg episode reward: [(0, '28.286')] [2024-08-05 15:07:43,430][15444] Updated weights for policy 0, policy_version 4911 (0.0011) [2024-08-05 15:07:46,867][15444] Updated weights for policy 0, policy_version 4921 (0.0020) [2024-08-05 15:07:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23210.7, 300 sec: 23548.5). Total num frames: 40345600. Throughput: 0: 5786.0. Samples: 10091600. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 15:07:48,119][15372] Avg episode reward: [(0, '27.592')] [2024-08-05 15:07:50,119][15444] Updated weights for policy 0, policy_version 4931 (0.0018) [2024-08-05 15:07:51,443][15417] Signal inference workers to stop experience collection... (1650 times) [2024-08-05 15:07:51,443][15417] Signal inference workers to resume experience collection... (1650 times) [2024-08-05 15:07:51,474][15444] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-08-05 15:07:51,524][15444] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-08-05 15:07:53,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23347.2, 300 sec: 23548.6). Total num frames: 40460288. 
Throughput: 0: 5817.4. Samples: 10109420. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:07:53,119][15372] Avg episode reward: [(0, '27.667')] [2024-08-05 15:07:53,668][15444] Updated weights for policy 0, policy_version 4941 (0.0018) [2024-08-05 15:07:57,152][15444] Updated weights for policy 0, policy_version 4951 (0.0021) [2024-08-05 15:07:58,120][15372] Fps is (10 sec: 23753.9, 60 sec: 23346.7, 300 sec: 23576.2). Total num frames: 40583168. Throughput: 0: 5837.9. Samples: 10145300. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:07:58,120][15372] Avg episode reward: [(0, '28.620')] [2024-08-05 15:08:00,681][15444] Updated weights for policy 0, policy_version 4961 (0.0012) [2024-08-05 15:08:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23347.2, 300 sec: 23548.6). Total num frames: 40697856. Throughput: 0: 5845.2. Samples: 10181060. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 15:08:03,119][15372] Avg episode reward: [(0, '28.769')] [2024-08-05 15:08:04,043][15444] Updated weights for policy 0, policy_version 4971 (0.0011) [2024-08-05 15:08:07,797][15444] Updated weights for policy 0, policy_version 4981 (0.0012) [2024-08-05 15:08:08,118][15372] Fps is (10 sec: 22940.4, 60 sec: 23347.3, 300 sec: 23548.5). Total num frames: 40812544. Throughput: 0: 5813.5. Samples: 10197700. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 15:08:08,119][15372] Avg episode reward: [(0, '28.226')] [2024-08-05 15:08:11,155][15444] Updated weights for policy 0, policy_version 4991 (0.0020) [2024-08-05 15:08:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23483.8, 300 sec: 23576.3). Total num frames: 40935424. Throughput: 0: 5832.2. Samples: 10232630. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:08:13,126][15372] Avg episode reward: [(0, '27.495')] [2024-08-05 15:08:14,587][15444] Updated weights for policy 0, policy_version 5001 (0.0013) [2024-08-05 15:08:17,994][15444] Updated weights for policy 0, policy_version 5011 (0.0013) [2024-08-05 15:08:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23347.4, 300 sec: 23548.5). Total num frames: 41050112. Throughput: 0: 5846.6. Samples: 10267800. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:08:18,119][15372] Avg episode reward: [(0, '27.770')] [2024-08-05 15:08:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000005011_41050112.pth... [2024-08-05 15:08:18,237][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000004317_35364864.pth [2024-08-05 15:08:21,585][15444] Updated weights for policy 0, policy_version 5021 (0.0013) [2024-08-05 15:08:23,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23347.2, 300 sec: 23576.3). Total num frames: 41164800. Throughput: 0: 5846.0. Samples: 10285500. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 15:08:23,126][15372] Avg episode reward: [(0, '28.103')] [2024-08-05 15:08:25,110][15444] Updated weights for policy 0, policy_version 5031 (0.0019) [2024-08-05 15:08:28,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23347.2, 300 sec: 23576.3). Total num frames: 41279488. Throughput: 0: 5868.5. Samples: 10321030. Policy #0 lag: (min: 0.0, avg: 3.6, max: 9.0) [2024-08-05 15:08:28,126][15372] Avg episode reward: [(0, '28.825')] [2024-08-05 15:08:28,508][15444] Updated weights for policy 0, policy_version 5041 (0.0032) [2024-08-05 15:08:32,412][15444] Updated weights for policy 0, policy_version 5051 (0.0011) [2024-08-05 15:08:33,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23484.1, 300 sec: 23604.1). Total num frames: 41402368. 
Throughput: 0: 5861.1. Samples: 10355350. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:08:33,119][15372] Avg episode reward: [(0, '29.000')] [2024-08-05 15:08:35,466][15444] Updated weights for policy 0, policy_version 5061 (0.0016) [2024-08-05 15:08:38,119][15372] Fps is (10 sec: 22936.2, 60 sec: 23347.0, 300 sec: 23604.0). Total num frames: 41508864. Throughput: 0: 5849.5. Samples: 10372650. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 15:08:38,119][15372] Avg episode reward: [(0, '28.352')] [2024-08-05 15:08:39,209][15444] Updated weights for policy 0, policy_version 5071 (0.0033) [2024-08-05 15:08:43,118][15372] Fps is (10 sec: 22118.6, 60 sec: 23347.2, 300 sec: 23659.6). Total num frames: 41623552. Throughput: 0: 5820.2. Samples: 10407200. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:08:43,123][15444] Updated weights for policy 0, policy_version 5081 (0.0022) [2024-08-05 15:08:43,126][15372] Avg episode reward: [(0, '28.486')] [2024-08-05 15:08:43,743][15417] Signal inference workers to stop experience collection... (1700 times) [2024-08-05 15:08:43,743][15417] Signal inference workers to resume experience collection... (1700 times) [2024-08-05 15:08:43,792][15444] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-08-05 15:08:43,793][15444] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-08-05 15:08:46,223][15444] Updated weights for policy 0, policy_version 5091 (0.0027) [2024-08-05 15:08:48,119][15372] Fps is (10 sec: 22938.9, 60 sec: 23210.7, 300 sec: 23659.6). Total num frames: 41738240. Throughput: 0: 5790.7. Samples: 10441640. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:08:48,126][15372] Avg episode reward: [(0, '28.005')] [2024-08-05 15:08:50,226][15444] Updated weights for policy 0, policy_version 5101 (0.0057) [2024-08-05 15:08:53,119][15372] Fps is (10 sec: 23756.1, 60 sec: 23347.1, 300 sec: 23659.6). Total num frames: 41861120. 
Throughput: 0: 5798.6. Samples: 10458640. Policy #0 lag: (min: 2.0, avg: 3.6, max: 8.0) [2024-08-05 15:08:53,119][15372] Avg episode reward: [(0, '28.743')] [2024-08-05 15:08:53,786][15444] Updated weights for policy 0, policy_version 5111 (0.0014) [2024-08-05 15:08:57,020][15444] Updated weights for policy 0, policy_version 5121 (0.0025) [2024-08-05 15:08:58,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23074.6, 300 sec: 23631.8). Total num frames: 41967616. Throughput: 0: 5792.9. Samples: 10493310. Policy #0 lag: (min: 0.0, avg: 4.4, max: 10.0) [2024-08-05 15:08:58,119][15372] Avg episode reward: [(0, '29.206')] [2024-08-05 15:08:58,122][15417] Saving new best policy, reward=29.206! [2024-08-05 15:09:00,797][15444] Updated weights for policy 0, policy_version 5131 (0.0022) [2024-08-05 15:09:03,119][15372] Fps is (10 sec: 22118.5, 60 sec: 23074.0, 300 sec: 23604.1). Total num frames: 42082304. Throughput: 0: 5796.9. Samples: 10528660. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 15:09:03,119][15372] Avg episode reward: [(0, '28.270')] [2024-08-05 15:09:04,051][15444] Updated weights for policy 0, policy_version 5141 (0.0017) [2024-08-05 15:09:07,833][15444] Updated weights for policy 0, policy_version 5151 (0.0037) [2024-08-05 15:09:08,119][15372] Fps is (10 sec: 23756.4, 60 sec: 23210.6, 300 sec: 23604.1). Total num frames: 42205184. Throughput: 0: 5772.4. Samples: 10545260. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:09:08,119][15372] Avg episode reward: [(0, '27.685')] [2024-08-05 15:09:11,273][15444] Updated weights for policy 0, policy_version 5161 (0.0014) [2024-08-05 15:09:13,119][15372] Fps is (10 sec: 23757.1, 60 sec: 23074.1, 300 sec: 23576.3). Total num frames: 42319872. Throughput: 0: 5748.0. Samples: 10579690. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:09:13,126][15372] Avg episode reward: [(0, '28.255')] [2024-08-05 15:09:14,851][15444] Updated weights for policy 0, policy_version 5171 (0.0028) [2024-08-05 15:09:18,119][15372] Fps is (10 sec: 22937.7, 60 sec: 23074.1, 300 sec: 23576.3). Total num frames: 42434560. Throughput: 0: 5770.2. Samples: 10615010. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 15:09:18,119][15372] Avg episode reward: [(0, '29.261')] [2024-08-05 15:09:18,122][15417] Saving new best policy, reward=29.261! [2024-08-05 15:09:18,342][15444] Updated weights for policy 0, policy_version 5181 (0.0015) [2024-08-05 15:09:21,917][15444] Updated weights for policy 0, policy_version 5191 (0.0014) [2024-08-05 15:09:23,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23074.1, 300 sec: 23576.3). Total num frames: 42549248. Throughput: 0: 5775.9. Samples: 10632560. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 15:09:23,119][15372] Avg episode reward: [(0, '29.054')] [2024-08-05 15:09:25,445][15444] Updated weights for policy 0, policy_version 5201 (0.0011) [2024-08-05 15:09:28,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23210.6, 300 sec: 23548.5). Total num frames: 42672128. Throughput: 0: 5789.3. Samples: 10667720. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:09:28,119][15372] Avg episode reward: [(0, '28.846')] [2024-08-05 15:09:28,862][15444] Updated weights for policy 0, policy_version 5211 (0.0025) [2024-08-05 15:09:32,568][15444] Updated weights for policy 0, policy_version 5221 (0.0022) [2024-08-05 15:09:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23074.1, 300 sec: 23548.5). Total num frames: 42786816. Throughput: 0: 5793.6. Samples: 10702350. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 15:09:33,119][15372] Avg episode reward: [(0, '28.042')] [2024-08-05 15:09:35,775][15444] Updated weights for policy 0, policy_version 5231 (0.0033) [2024-08-05 15:09:38,118][15372] Fps is (10 sec: 22938.0, 60 sec: 23210.9, 300 sec: 23548.5). Total num frames: 42901504. Throughput: 0: 5807.6. Samples: 10719980. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 15:09:38,126][15372] Avg episode reward: [(0, '28.229')] [2024-08-05 15:09:39,548][15444] Updated weights for policy 0, policy_version 5241 (0.0027) [2024-08-05 15:09:39,658][15417] Signal inference workers to stop experience collection... (1750 times) [2024-08-05 15:09:39,658][15417] Signal inference workers to resume experience collection... (1750 times) [2024-08-05 15:09:39,694][15444] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-08-05 15:09:39,695][15444] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-08-05 15:09:43,039][15444] Updated weights for policy 0, policy_version 5251 (0.0020) [2024-08-05 15:09:43,119][15372] Fps is (10 sec: 22937.3, 60 sec: 23210.6, 300 sec: 23493.1). Total num frames: 43016192. Throughput: 0: 5802.4. Samples: 10754420. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:09:43,119][15372] Avg episode reward: [(0, '27.658')] [2024-08-05 15:09:46,638][15444] Updated weights for policy 0, policy_version 5261 (0.0023) [2024-08-05 15:09:48,118][15372] Fps is (10 sec: 22118.3, 60 sec: 23074.1, 300 sec: 23437.4). Total num frames: 43122688. Throughput: 0: 5778.7. Samples: 10788700. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 15:09:48,126][15372] Avg episode reward: [(0, '27.648')] [2024-08-05 15:09:50,366][15444] Updated weights for policy 0, policy_version 5271 (0.0012) [2024-08-05 15:09:53,120][15372] Fps is (10 sec: 22934.6, 60 sec: 23073.7, 300 sec: 23437.3). Total num frames: 43245568. Throughput: 0: 5804.5. Samples: 10806470. 
Policy #0 lag: (min: 0.0, avg: 4.8, max: 9.0) [2024-08-05 15:09:53,120][15372] Avg episode reward: [(0, '28.379')] [2024-08-05 15:09:53,546][15444] Updated weights for policy 0, policy_version 5281 (0.0044) [2024-08-05 15:09:57,224][15444] Updated weights for policy 0, policy_version 5291 (0.0013) [2024-08-05 15:09:58,118][15372] Fps is (10 sec: 24576.2, 60 sec: 23347.2, 300 sec: 23437.5). Total num frames: 43368448. Throughput: 0: 5812.9. Samples: 10841270. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:09:58,119][15372] Avg episode reward: [(0, '29.012')] [2024-08-05 15:10:00,438][15444] Updated weights for policy 0, policy_version 5301 (0.0020) [2024-08-05 15:10:03,128][15372] Fps is (10 sec: 23736.5, 60 sec: 23343.4, 300 sec: 23436.7). Total num frames: 43483136. Throughput: 0: 5800.7. Samples: 10876100. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 15:10:03,129][15372] Avg episode reward: [(0, '28.908')] [2024-08-05 15:10:04,235][15444] Updated weights for policy 0, policy_version 5311 (0.0018) [2024-08-05 15:10:07,776][15444] Updated weights for policy 0, policy_version 5321 (0.0013) [2024-08-05 15:10:08,118][15372] Fps is (10 sec: 22118.3, 60 sec: 23074.2, 300 sec: 23381.9). Total num frames: 43589632. Throughput: 0: 5803.3. Samples: 10893710. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 15:10:08,119][15372] Avg episode reward: [(0, '27.870')] [2024-08-05 15:10:11,079][15444] Updated weights for policy 0, policy_version 5331 (0.0019) [2024-08-05 15:10:13,118][15372] Fps is (10 sec: 22960.5, 60 sec: 23210.7, 300 sec: 23381.9). Total num frames: 43712512. Throughput: 0: 5795.6. Samples: 10928520. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:10:13,120][15372] Avg episode reward: [(0, '27.918')] [2024-08-05 15:10:15,002][15444] Updated weights for policy 0, policy_version 5341 (0.0023) [2024-08-05 15:10:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23210.7, 300 sec: 23354.2). Total num frames: 43827200. 
Throughput: 0: 5792.9. Samples: 10963030. Policy #0 lag: (min: 2.0, avg: 3.8, max: 8.0) [2024-08-05 15:10:18,130][15372] Avg episode reward: [(0, '28.590')] [2024-08-05 15:10:18,135][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000005350_43827200.pth... [2024-08-05 15:10:18,273][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000004671_38264832.pth [2024-08-05 15:10:18,389][15444] Updated weights for policy 0, policy_version 5351 (0.0018) [2024-08-05 15:10:22,009][15444] Updated weights for policy 0, policy_version 5361 (0.0023) [2024-08-05 15:10:23,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23210.7, 300 sec: 23326.4). Total num frames: 43941888. Throughput: 0: 5780.7. Samples: 10980110. Policy #0 lag: (min: 1.0, avg: 4.2, max: 7.0) [2024-08-05 15:10:23,119][15372] Avg episode reward: [(0, '29.329')] [2024-08-05 15:10:23,119][15417] Saving new best policy, reward=29.329! [2024-08-05 15:10:25,366][15444] Updated weights for policy 0, policy_version 5371 (0.0015) [2024-08-05 15:10:28,119][15372] Fps is (10 sec: 22935.9, 60 sec: 23073.9, 300 sec: 23326.3). Total num frames: 44056576. Throughput: 0: 5790.1. Samples: 11014980. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:10:28,120][15372] Avg episode reward: [(0, '28.683')] [2024-08-05 15:10:29,139][15444] Updated weights for policy 0, policy_version 5381 (0.0018) [2024-08-05 15:10:32,780][15444] Updated weights for policy 0, policy_version 5391 (0.0012) [2024-08-05 15:10:32,929][15417] Signal inference workers to stop experience collection... (1800 times) [2024-08-05 15:10:32,929][15417] Signal inference workers to resume experience collection... 
(1800 times) [2024-08-05 15:10:32,976][15444] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-08-05 15:10:32,976][15444] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-08-05 15:10:33,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23074.2, 300 sec: 23270.8). Total num frames: 44171264. Throughput: 0: 5802.5. Samples: 11049810. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:10:33,119][15372] Avg episode reward: [(0, '28.290')] [2024-08-05 15:10:36,049][15444] Updated weights for policy 0, policy_version 5401 (0.0019) [2024-08-05 15:10:38,119][15372] Fps is (10 sec: 22939.2, 60 sec: 23074.1, 300 sec: 23270.8). Total num frames: 44285952. Throughput: 0: 5795.1. Samples: 11067240. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:10:38,119][15372] Avg episode reward: [(0, '28.720')] [2024-08-05 15:10:39,801][15444] Updated weights for policy 0, policy_version 5411 (0.0026) [2024-08-05 15:10:42,956][15444] Updated weights for policy 0, policy_version 5421 (0.0019) [2024-08-05 15:10:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23210.7, 300 sec: 23270.8). Total num frames: 44408832. Throughput: 0: 5794.0. Samples: 11102000. Policy #0 lag: (min: 1.0, avg: 3.4, max: 9.0) [2024-08-05 15:10:43,119][15372] Avg episode reward: [(0, '29.037')] [2024-08-05 15:10:46,824][15444] Updated weights for policy 0, policy_version 5431 (0.0015) [2024-08-05 15:10:48,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23210.7, 300 sec: 23243.1). Total num frames: 44515328. Throughput: 0: 5775.7. Samples: 11135950. Policy #0 lag: (min: 1.0, avg: 3.4, max: 9.0) [2024-08-05 15:10:48,119][15372] Avg episode reward: [(0, '28.356')] [2024-08-05 15:10:50,299][15444] Updated weights for policy 0, policy_version 5441 (0.0027) [2024-08-05 15:10:53,118][15372] Fps is (10 sec: 22118.4, 60 sec: 23074.7, 300 sec: 23215.3). Total num frames: 44630016. Throughput: 0: 5782.0. Samples: 11153900. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:10:53,119][15372] Avg episode reward: [(0, '28.953')] [2024-08-05 15:10:53,863][15444] Updated weights for policy 0, policy_version 5451 (0.0017) [2024-08-05 15:10:57,613][15444] Updated weights for policy 0, policy_version 5461 (0.0024) [2024-08-05 15:10:58,118][15372] Fps is (10 sec: 22937.7, 60 sec: 22937.6, 300 sec: 23187.5). Total num frames: 44744704. Throughput: 0: 5758.7. Samples: 11187660. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 15:10:58,120][15372] Avg episode reward: [(0, '28.366')] [2024-08-05 15:11:00,850][15444] Updated weights for policy 0, policy_version 5471 (0.0023) [2024-08-05 15:11:03,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23078.0, 300 sec: 23187.5). Total num frames: 44867584. Throughput: 0: 5760.4. Samples: 11222250. Policy #0 lag: (min: 0.0, avg: 5.0, max: 10.0) [2024-08-05 15:11:03,126][15372] Avg episode reward: [(0, '28.296')] [2024-08-05 15:11:04,818][15444] Updated weights for policy 0, policy_version 5481 (0.0030) [2024-08-05 15:11:08,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23074.2, 300 sec: 23159.8). Total num frames: 44974080. Throughput: 0: 5749.8. Samples: 11238850. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 15:11:08,119][15372] Avg episode reward: [(0, '28.215')] [2024-08-05 15:11:08,191][15444] Updated weights for policy 0, policy_version 5491 (0.0012) [2024-08-05 15:11:11,559][15417] Signal inference workers to stop experience collection... (1850 times) [2024-08-05 15:11:11,565][15417] Signal inference workers to resume experience collection... 
(1850 times) [2024-08-05 15:11:11,636][15444] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-08-05 15:11:11,637][15444] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-08-05 15:11:11,638][15444] Updated weights for policy 0, policy_version 5501 (0.0019) [2024-08-05 15:11:13,118][15372] Fps is (10 sec: 22118.4, 60 sec: 22937.6, 300 sec: 23132.0). Total num frames: 45088768. Throughput: 0: 5745.9. Samples: 11273540. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:11:13,126][15372] Avg episode reward: [(0, '29.730')] [2024-08-05 15:11:13,127][15417] Saving new best policy, reward=29.730! [2024-08-05 15:11:15,456][15444] Updated weights for policy 0, policy_version 5511 (0.0013) [2024-08-05 15:11:18,131][15372] Fps is (10 sec: 22907.9, 60 sec: 22932.7, 300 sec: 23158.7). Total num frames: 45203456. Throughput: 0: 5737.2. Samples: 11308060. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 15:11:18,139][15372] Avg episode reward: [(0, '29.404')] [2024-08-05 15:11:19,181][15444] Updated weights for policy 0, policy_version 5521 (0.0014) [2024-08-05 15:11:23,102][15444] Updated weights for policy 0, policy_version 5531 (0.0049) [2024-08-05 15:11:23,119][15372] Fps is (10 sec: 22117.9, 60 sec: 22801.0, 300 sec: 23132.0). Total num frames: 45309952. Throughput: 0: 5712.4. Samples: 11324300. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 15:11:23,122][15372] Avg episode reward: [(0, '29.032')] [2024-08-05 15:11:26,998][15444] Updated weights for policy 0, policy_version 5541 (0.0022) [2024-08-05 15:11:28,119][15372] Fps is (10 sec: 21326.4, 60 sec: 22664.8, 300 sec: 23104.2). Total num frames: 45416448. Throughput: 0: 5612.4. Samples: 11354560. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:11:28,119][15372] Avg episode reward: [(0, '28.890')] [2024-08-05 15:11:30,260][15444] Updated weights for policy 0, policy_version 5551 (0.0027) [2024-08-05 15:11:33,119][15372] Fps is (10 sec: 22117.1, 60 sec: 22664.2, 300 sec: 23104.1). Total num frames: 45531136. Throughput: 0: 5639.9. Samples: 11389750. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 15:11:33,132][15372] Avg episode reward: [(0, '28.224')] [2024-08-05 15:11:33,936][15444] Updated weights for policy 0, policy_version 5561 (0.0017) [2024-08-05 15:11:37,912][15444] Updated weights for policy 0, policy_version 5571 (0.0013) [2024-08-05 15:11:38,118][15372] Fps is (10 sec: 22118.7, 60 sec: 22528.0, 300 sec: 23048.7). Total num frames: 45637632. Throughput: 0: 5632.7. Samples: 11407370. Policy #0 lag: (min: 2.0, avg: 4.8, max: 9.0) [2024-08-05 15:11:38,119][15372] Avg episode reward: [(0, '28.919')] [2024-08-05 15:11:41,063][15444] Updated weights for policy 0, policy_version 5581 (0.0029) [2024-08-05 15:11:43,119][15372] Fps is (10 sec: 22938.1, 60 sec: 22527.8, 300 sec: 23076.4). Total num frames: 45760512. Throughput: 0: 5620.4. Samples: 11440580. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:11:43,119][15372] Avg episode reward: [(0, '30.368')] [2024-08-05 15:11:43,120][15417] Saving new best policy, reward=30.368! [2024-08-05 15:11:45,022][15444] Updated weights for policy 0, policy_version 5591 (0.0035) [2024-08-05 15:11:47,604][15417] Signal inference workers to stop experience collection... (1900 times) [2024-08-05 15:11:47,605][15417] Signal inference workers to resume experience collection... (1900 times) [2024-08-05 15:11:47,638][15444] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-08-05 15:11:47,645][15444] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-08-05 15:11:48,119][15372] Fps is (10 sec: 23754.9, 60 sec: 22664.2, 300 sec: 23104.2). 
Total num frames: 45875200. Throughput: 0: 5616.3. Samples: 11474990. Policy #0 lag: (min: 2.0, avg: 4.2, max: 8.0) [2024-08-05 15:11:48,127][15372] Avg episode reward: [(0, '29.088')] [2024-08-05 15:11:48,251][15444] Updated weights for policy 0, policy_version 5601 (0.0021) [2024-08-05 15:11:52,162][15444] Updated weights for policy 0, policy_version 5611 (0.0013) [2024-08-05 15:11:53,119][15372] Fps is (10 sec: 23757.8, 60 sec: 22801.0, 300 sec: 23104.2). Total num frames: 45998080. Throughput: 0: 5626.4. Samples: 11492040. Policy #0 lag: (min: 2.0, avg: 4.2, max: 8.0) [2024-08-05 15:11:53,119][15372] Avg episode reward: [(0, '28.111')] [2024-08-05 15:11:55,266][15444] Updated weights for policy 0, policy_version 5621 (0.0029) [2024-08-05 15:11:58,119][15372] Fps is (10 sec: 22938.9, 60 sec: 22664.4, 300 sec: 23076.4). Total num frames: 46104576. Throughput: 0: 5634.9. Samples: 11527110. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:11:58,119][15372] Avg episode reward: [(0, '27.592')] [2024-08-05 15:11:59,179][15444] Updated weights for policy 0, policy_version 5631 (0.0023) [2024-08-05 15:12:02,581][15444] Updated weights for policy 0, policy_version 5641 (0.0022) [2024-08-05 15:12:03,119][15372] Fps is (10 sec: 21299.6, 60 sec: 22391.5, 300 sec: 23048.7). Total num frames: 46211072. Throughput: 0: 5623.2. Samples: 11561030. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 15:12:03,119][15372] Avg episode reward: [(0, '28.968')] [2024-08-05 15:12:06,089][15444] Updated weights for policy 0, policy_version 5651 (0.0032) [2024-08-05 15:12:08,119][15372] Fps is (10 sec: 22937.0, 60 sec: 22664.3, 300 sec: 23076.4). Total num frames: 46333952. Throughput: 0: 5649.7. Samples: 11578540. 
Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 15:12:08,122][15372] Avg episode reward: [(0, '29.885')] [2024-08-05 15:12:09,775][15444] Updated weights for policy 0, policy_version 5661 (0.0019) [2024-08-05 15:12:13,115][15444] Updated weights for policy 0, policy_version 5671 (0.0026) [2024-08-05 15:12:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 22801.1, 300 sec: 23076.5). Total num frames: 46456832. Throughput: 0: 5746.9. Samples: 11613170. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:12:13,123][15372] Avg episode reward: [(0, '29.113')] [2024-08-05 15:12:16,912][15444] Updated weights for policy 0, policy_version 5681 (0.0018) [2024-08-05 15:12:18,118][15372] Fps is (10 sec: 22938.7, 60 sec: 22669.4, 300 sec: 23048.7). Total num frames: 46563328. Throughput: 0: 5725.4. Samples: 11647390. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 15:12:18,119][15372] Avg episode reward: [(0, '28.058')] [2024-08-05 15:12:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000005684_46563328.pth... [2024-08-05 15:12:18,246][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000005011_41050112.pth [2024-08-05 15:12:20,498][15444] Updated weights for policy 0, policy_version 5691 (0.0020) [2024-08-05 15:12:23,119][15372] Fps is (10 sec: 22118.2, 60 sec: 22801.1, 300 sec: 23048.7). Total num frames: 46678016. Throughput: 0: 5728.0. Samples: 11665130. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:12:23,119][15372] Avg episode reward: [(0, '29.193')] [2024-08-05 15:12:23,953][15444] Updated weights for policy 0, policy_version 5701 (0.0023) [2024-08-05 15:12:27,599][15444] Updated weights for policy 0, policy_version 5711 (0.0015) [2024-08-05 15:12:28,119][15372] Fps is (10 sec: 22936.6, 60 sec: 22937.5, 300 sec: 23048.7). Total num frames: 46792704. 
Throughput: 0: 5752.5. Samples: 11699440. Policy #0 lag: (min: 0.0, avg: 3.7, max: 9.0) [2024-08-05 15:12:28,120][15372] Avg episode reward: [(0, '29.658')] [2024-08-05 15:12:30,805][15444] Updated weights for policy 0, policy_version 5721 (0.0050) [2024-08-05 15:12:33,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23074.5, 300 sec: 23076.5). Total num frames: 46915584. Throughput: 0: 5765.0. Samples: 11734410. Policy #0 lag: (min: 0.0, avg: 3.0, max: 8.0) [2024-08-05 15:12:33,126][15372] Avg episode reward: [(0, '28.348')] [2024-08-05 15:12:34,755][15444] Updated weights for policy 0, policy_version 5731 (0.0016) [2024-08-05 15:12:36,746][15417] Signal inference workers to stop experience collection... (1950 times) [2024-08-05 15:12:36,747][15417] Signal inference workers to resume experience collection... (1950 times) [2024-08-05 15:12:36,805][15444] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-08-05 15:12:36,806][15444] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-08-05 15:12:38,036][15444] Updated weights for policy 0, policy_version 5741 (0.0015) [2024-08-05 15:12:38,118][15372] Fps is (10 sec: 23757.9, 60 sec: 23210.7, 300 sec: 23076.5). Total num frames: 47030272. Throughput: 0: 5756.3. Samples: 11751070. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 15:12:38,119][15372] Avg episode reward: [(0, '29.127')] [2024-08-05 15:12:41,683][15444] Updated weights for policy 0, policy_version 5751 (0.0011) [2024-08-05 15:12:43,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23074.4, 300 sec: 23048.7). Total num frames: 47144960. Throughput: 0: 5765.4. Samples: 11786550. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 15:12:43,119][15372] Avg episode reward: [(0, '30.174')] [2024-08-05 15:12:45,184][15444] Updated weights for policy 0, policy_version 5761 (0.0022) [2024-08-05 15:12:48,122][15372] Fps is (10 sec: 22110.6, 60 sec: 22936.6, 300 sec: 23020.6). Total num frames: 47251456. 
Throughput: 0: 5776.7. Samples: 11821000. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 15:12:48,130][15372] Avg episode reward: [(0, '28.430')] [2024-08-05 15:12:48,597][15444] Updated weights for policy 0, policy_version 5771 (0.0027) [2024-08-05 15:12:52,552][15444] Updated weights for policy 0, policy_version 5781 (0.0016) [2024-08-05 15:12:53,118][15372] Fps is (10 sec: 22937.6, 60 sec: 22937.7, 300 sec: 23021.0). Total num frames: 47374336. Throughput: 0: 5764.3. Samples: 11837930. Policy #0 lag: (min: 0.0, avg: 4.7, max: 10.0) [2024-08-05 15:12:53,119][15372] Avg episode reward: [(0, '28.240')] [2024-08-05 15:12:55,662][15444] Updated weights for policy 0, policy_version 5791 (0.0028) [2024-08-05 15:12:58,119][15372] Fps is (10 sec: 23764.6, 60 sec: 23074.1, 300 sec: 23020.9). Total num frames: 47489024. Throughput: 0: 5756.0. Samples: 11872190. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:12:58,128][15372] Avg episode reward: [(0, '28.937')] [2024-08-05 15:12:59,626][15444] Updated weights for policy 0, policy_version 5801 (0.0031) [2024-08-05 15:13:03,118][15372] Fps is (10 sec: 22118.5, 60 sec: 23074.2, 300 sec: 22993.1). Total num frames: 47595520. Throughput: 0: 5748.0. Samples: 11906050. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 15:13:03,119][15372] Avg episode reward: [(0, '29.472')] [2024-08-05 15:13:03,229][15444] Updated weights for policy 0, policy_version 5811 (0.0021) [2024-08-05 15:13:06,590][15444] Updated weights for policy 0, policy_version 5821 (0.0013) [2024-08-05 15:13:08,118][15372] Fps is (10 sec: 22938.2, 60 sec: 23074.3, 300 sec: 22993.1). Total num frames: 47718400. Throughput: 0: 5756.2. Samples: 11924160. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:13:08,126][15372] Avg episode reward: [(0, '28.982')] [2024-08-05 15:13:10,082][15444] Updated weights for policy 0, policy_version 5831 (0.0041) [2024-08-05 15:13:13,119][15372] Fps is (10 sec: 23756.3, 60 sec: 22937.5, 300 sec: 22993.1). Total num frames: 47833088. Throughput: 0: 5768.0. Samples: 11959000. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:13:13,127][15372] Avg episode reward: [(0, '29.353')] [2024-08-05 15:13:13,842][15444] Updated weights for policy 0, policy_version 5841 (0.0011) [2024-08-05 15:13:17,241][15444] Updated weights for policy 0, policy_version 5851 (0.0017) [2024-08-05 15:13:18,119][15372] Fps is (10 sec: 22936.6, 60 sec: 23074.0, 300 sec: 22993.1). Total num frames: 47947776. Throughput: 0: 5755.1. Samples: 11993390. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 15:13:18,119][15372] Avg episode reward: [(0, '29.306')] [2024-08-05 15:13:20,705][15444] Updated weights for policy 0, policy_version 5861 (0.0017) [2024-08-05 15:13:23,118][15372] Fps is (10 sec: 23757.2, 60 sec: 23210.7, 300 sec: 23020.9). Total num frames: 48070656. Throughput: 0: 5788.4. Samples: 12011550. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:13:23,119][15372] Avg episode reward: [(0, '29.000')] [2024-08-05 15:13:24,366][15444] Updated weights for policy 0, policy_version 5871 (0.0033) [2024-08-05 15:13:27,613][15417] Signal inference workers to stop experience collection... (2000 times) [2024-08-05 15:13:27,620][15417] Signal inference workers to resume experience collection... 
(2000 times) [2024-08-05 15:13:27,657][15444] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-08-05 15:13:27,658][15444] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-08-05 15:13:27,702][15444] Updated weights for policy 0, policy_version 5881 (0.0015) [2024-08-05 15:13:28,118][15372] Fps is (10 sec: 22938.7, 60 sec: 23074.3, 300 sec: 22965.4). Total num frames: 48177152. Throughput: 0: 5762.2. Samples: 12045850. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:13:28,119][15372] Avg episode reward: [(0, '29.309')] [2024-08-05 15:13:31,220][15444] Updated weights for policy 0, policy_version 5891 (0.0032) [2024-08-05 15:13:33,118][15372] Fps is (10 sec: 22118.5, 60 sec: 22937.6, 300 sec: 22993.2). Total num frames: 48291840. Throughput: 0: 5761.3. Samples: 12080240. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:13:33,126][15372] Avg episode reward: [(0, '29.264')] [2024-08-05 15:13:35,001][15444] Updated weights for policy 0, policy_version 5901 (0.0018) [2024-08-05 15:13:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23074.1, 300 sec: 23020.9). Total num frames: 48414720. Throughput: 0: 5770.2. Samples: 12097590. Policy #0 lag: (min: 1.0, avg: 4.7, max: 8.0) [2024-08-05 15:13:38,126][15372] Avg episode reward: [(0, '29.744')] [2024-08-05 15:13:38,518][15444] Updated weights for policy 0, policy_version 5911 (0.0018) [2024-08-05 15:13:42,025][15444] Updated weights for policy 0, policy_version 5921 (0.0039) [2024-08-05 15:13:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23074.1, 300 sec: 23020.9). Total num frames: 48529408. Throughput: 0: 5775.4. Samples: 12132080. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:13:43,119][15372] Avg episode reward: [(0, '28.869')] [2024-08-05 15:13:45,788][15444] Updated weights for policy 0, policy_version 5931 (0.0014) [2024-08-05 15:13:48,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23212.0, 300 sec: 22993.2). 
Total num frames: 48644096. Throughput: 0: 5821.1. Samples: 12168000. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 15:13:48,119][15372] Avg episode reward: [(0, '29.107')] [2024-08-05 15:13:48,871][15444] Updated weights for policy 0, policy_version 5941 (0.0011) [2024-08-05 15:13:52,870][15444] Updated weights for policy 0, policy_version 5951 (0.0011) [2024-08-05 15:13:53,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23074.2, 300 sec: 23020.9). Total num frames: 48758784. Throughput: 0: 5792.4. Samples: 12184820. Policy #0 lag: (min: 2.0, avg: 4.1, max: 9.0) [2024-08-05 15:13:53,119][15372] Avg episode reward: [(0, '29.556')] [2024-08-05 15:13:56,028][15444] Updated weights for policy 0, policy_version 5961 (0.0022) [2024-08-05 15:13:58,118][15372] Fps is (10 sec: 22937.8, 60 sec: 23074.2, 300 sec: 23020.9). Total num frames: 48873472. Throughput: 0: 5782.5. Samples: 12219210. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 15:13:58,119][15372] Avg episode reward: [(0, '29.189')] [2024-08-05 15:13:59,838][15444] Updated weights for policy 0, policy_version 5971 (0.0023) [2024-08-05 15:14:03,118][15372] Fps is (10 sec: 22937.4, 60 sec: 23210.6, 300 sec: 22993.1). Total num frames: 48988160. Throughput: 0: 5766.7. Samples: 12252890. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 15:14:03,127][15372] Avg episode reward: [(0, '29.699')] [2024-08-05 15:14:03,397][15444] Updated weights for policy 0, policy_version 5981 (0.0012) [2024-08-05 15:14:06,810][15444] Updated weights for policy 0, policy_version 5991 (0.0029) [2024-08-05 15:14:08,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23074.1, 300 sec: 22993.1). Total num frames: 49102848. Throughput: 0: 5764.4. Samples: 12270950. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:14:08,119][15372] Avg episode reward: [(0, '29.344')] [2024-08-05 15:14:10,546][15444] Updated weights for policy 0, policy_version 6001 (0.0027) [2024-08-05 15:14:13,124][15372] Fps is (10 sec: 22925.2, 60 sec: 23072.1, 300 sec: 22992.7). Total num frames: 49217536. Throughput: 0: 5760.0. Samples: 12305080. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:14:13,132][15372] Avg episode reward: [(0, '29.309')] [2024-08-05 15:14:13,917][15444] Updated weights for policy 0, policy_version 6011 (0.0012) [2024-08-05 15:14:16,231][15417] Signal inference workers to stop experience collection... (2050 times) [2024-08-05 15:14:16,236][15417] Signal inference workers to resume experience collection... (2050 times) [2024-08-05 15:14:16,286][15444] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-08-05 15:14:16,286][15444] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-08-05 15:14:17,664][15444] Updated weights for policy 0, policy_version 6021 (0.0020) [2024-08-05 15:14:18,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23074.3, 300 sec: 22993.1). Total num frames: 49332224. Throughput: 0: 5768.2. Samples: 12339810. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 15:14:18,119][15372] Avg episode reward: [(0, '30.189')] [2024-08-05 15:14:18,170][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000006023_49340416.pth... [2024-08-05 15:14:18,303][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000005350_43827200.pth [2024-08-05 15:14:21,198][15444] Updated weights for policy 0, policy_version 6031 (0.0014) [2024-08-05 15:14:23,118][15372] Fps is (10 sec: 22950.1, 60 sec: 22937.6, 300 sec: 22965.4). Total num frames: 49446912. Throughput: 0: 5761.6. Samples: 12356860. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:14:23,129][15372] Avg episode reward: [(0, '30.145')] [2024-08-05 15:14:24,631][15444] Updated weights for policy 0, policy_version 6041 (0.0020) [2024-08-05 15:14:28,084][15444] Updated weights for policy 0, policy_version 6051 (0.0013) [2024-08-05 15:14:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23210.6, 300 sec: 22993.1). Total num frames: 49569792. Throughput: 0: 5778.0. Samples: 12392090. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:14:28,119][15372] Avg episode reward: [(0, '29.372')] [2024-08-05 15:14:31,800][15444] Updated weights for policy 0, policy_version 6061 (0.0012) [2024-08-05 15:14:33,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23074.1, 300 sec: 22965.4). Total num frames: 49676288. Throughput: 0: 5742.7. Samples: 12426420. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:14:33,119][15372] Avg episode reward: [(0, '30.048')] [2024-08-05 15:14:35,115][15444] Updated weights for policy 0, policy_version 6071 (0.0012) [2024-08-05 15:14:38,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23074.1, 300 sec: 22993.1). Total num frames: 49799168. Throughput: 0: 5768.7. Samples: 12444410. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:14:38,126][15372] Avg episode reward: [(0, '29.957')] [2024-08-05 15:14:38,710][15444] Updated weights for policy 0, policy_version 6081 (0.0012) [2024-08-05 15:14:42,392][15444] Updated weights for policy 0, policy_version 6091 (0.0029) [2024-08-05 15:14:43,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23074.1, 300 sec: 23020.9). Total num frames: 49913856. Throughput: 0: 5775.1. Samples: 12479090. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:14:43,119][15372] Avg episode reward: [(0, '28.857')] [2024-08-05 15:14:45,632][15444] Updated weights for policy 0, policy_version 6101 (0.0010) [2024-08-05 15:14:48,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23074.2, 300 sec: 22993.3). Total num frames: 50028544. 
Throughput: 0: 5799.3. Samples: 12513860. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:14:48,126][15372] Avg episode reward: [(0, '28.297')] [2024-08-05 15:14:49,564][15444] Updated weights for policy 0, policy_version 6111 (0.0017) [2024-08-05 15:14:53,019][15444] Updated weights for policy 0, policy_version 6121 (0.0020) [2024-08-05 15:14:53,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23074.1, 300 sec: 22965.4). Total num frames: 50143232. Throughput: 0: 5775.6. Samples: 12530850. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:14:53,119][15372] Avg episode reward: [(0, '28.968')] [2024-08-05 15:14:56,378][15444] Updated weights for policy 0, policy_version 6131 (0.0012) [2024-08-05 15:14:58,119][15372] Fps is (10 sec: 22937.5, 60 sec: 23074.1, 300 sec: 22966.1). Total num frames: 50257920. Throughput: 0: 5784.5. Samples: 12565350. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:14:58,126][15372] Avg episode reward: [(0, '29.561')] [2024-08-05 15:14:59,983][15444] Updated weights for policy 0, policy_version 6141 (0.0013) [2024-08-05 15:15:03,119][15372] Fps is (10 sec: 22936.8, 60 sec: 23074.0, 300 sec: 22993.1). Total num frames: 50372608. Throughput: 0: 5793.3. Samples: 12600510. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:15:03,127][15372] Avg episode reward: [(0, '29.298')] [2024-08-05 15:15:03,474][15444] Updated weights for policy 0, policy_version 6151 (0.0014) [2024-08-05 15:15:07,041][15444] Updated weights for policy 0, policy_version 6161 (0.0025) [2024-08-05 15:15:08,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23210.7, 300 sec: 22993.1). Total num frames: 50495488. Throughput: 0: 5800.7. Samples: 12617890. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 15:15:08,119][15372] Avg episode reward: [(0, '29.355')] [2024-08-05 15:15:10,601][15444] Updated weights for policy 0, policy_version 6171 (0.0023) [2024-08-05 15:15:13,119][15372] Fps is (10 sec: 23757.5, 60 sec: 23212.7, 300 sec: 22993.1). Total num frames: 50610176. Throughput: 0: 5789.1. Samples: 12652600. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:15:13,119][15372] Avg episode reward: [(0, '29.086')] [2024-08-05 15:15:14,173][15444] Updated weights for policy 0, policy_version 6181 (0.0019) [2024-08-05 15:15:17,728][15444] Updated weights for policy 0, policy_version 6191 (0.0010) [2024-08-05 15:15:18,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23210.7, 300 sec: 22993.1). Total num frames: 50724864. Throughput: 0: 5811.8. Samples: 12687950. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 15:15:18,119][15372] Avg episode reward: [(0, '29.488')] [2024-08-05 15:15:21,313][15444] Updated weights for policy 0, policy_version 6201 (0.0012) [2024-08-05 15:15:22,571][15417] Signal inference workers to stop experience collection... (2100 times) [2024-08-05 15:15:22,571][15417] Signal inference workers to resume experience collection... (2100 times) [2024-08-05 15:15:22,641][15444] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-08-05 15:15:22,641][15444] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-08-05 15:15:23,118][15372] Fps is (10 sec: 22937.8, 60 sec: 23210.7, 300 sec: 22993.2). Total num frames: 50839552. Throughput: 0: 5794.7. Samples: 12705170. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 15:15:23,119][15372] Avg episode reward: [(0, '30.006')] [2024-08-05 15:15:24,507][15444] Updated weights for policy 0, policy_version 6211 (0.0027) [2024-08-05 15:15:28,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23074.1, 300 sec: 22993.1). Total num frames: 50954240. Throughput: 0: 5795.3. Samples: 12739880. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 15:15:28,126][15372] Avg episode reward: [(0, '30.531')] [2024-08-05 15:15:28,130][15417] Saving new best policy, reward=30.531! [2024-08-05 15:15:28,395][15444] Updated weights for policy 0, policy_version 6221 (0.0018) [2024-08-05 15:15:31,800][15444] Updated weights for policy 0, policy_version 6231 (0.0012) [2024-08-05 15:15:33,119][15372] Fps is (10 sec: 22937.2, 60 sec: 23210.6, 300 sec: 22993.1). Total num frames: 51068928. Throughput: 0: 5783.5. Samples: 12774120. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 15:15:33,119][15372] Avg episode reward: [(0, '29.838')] [2024-08-05 15:15:35,348][15444] Updated weights for policy 0, policy_version 6241 (0.0020) [2024-08-05 15:15:38,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23210.6, 300 sec: 22993.1). Total num frames: 51191808. Throughput: 0: 5799.1. Samples: 12791810. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:15:38,119][15372] Avg episode reward: [(0, '29.064')] [2024-08-05 15:15:38,882][15444] Updated weights for policy 0, policy_version 6251 (0.0017) [2024-08-05 15:15:42,291][15444] Updated weights for policy 0, policy_version 6261 (0.0023) [2024-08-05 15:15:43,118][15372] Fps is (10 sec: 23757.2, 60 sec: 23210.6, 300 sec: 23020.9). Total num frames: 51306496. Throughput: 0: 5807.8. Samples: 12826700. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:15:43,119][15372] Avg episode reward: [(0, '29.313')] [2024-08-05 15:15:45,825][15444] Updated weights for policy 0, policy_version 6271 (0.0023) [2024-08-05 15:15:48,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23210.6, 300 sec: 23020.9). Total num frames: 51421184. Throughput: 0: 5807.8. Samples: 12861860. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 15:15:48,119][15372] Avg episode reward: [(0, '29.648')] [2024-08-05 15:15:49,576][15444] Updated weights for policy 0, policy_version 6281 (0.0015) [2024-08-05 15:15:53,054][15444] Updated weights for policy 0, policy_version 6291 (0.0016) [2024-08-05 15:15:53,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23210.7, 300 sec: 23020.9). Total num frames: 51535872. Throughput: 0: 5803.6. Samples: 12879050. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:15:53,119][15372] Avg episode reward: [(0, '30.293')] [2024-08-05 15:15:56,320][15444] Updated weights for policy 0, policy_version 6301 (0.0013) [2024-08-05 15:15:58,118][15372] Fps is (10 sec: 23757.3, 60 sec: 23347.2, 300 sec: 23020.9). Total num frames: 51658752. Throughput: 0: 5796.7. Samples: 12913450. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:15:58,126][15372] Avg episode reward: [(0, '30.358')] [2024-08-05 15:15:59,972][15444] Updated weights for policy 0, policy_version 6311 (0.0022) [2024-08-05 15:16:03,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23210.8, 300 sec: 23020.9). Total num frames: 51765248. Throughput: 0: 5791.6. Samples: 12948570. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:16:03,127][15372] Avg episode reward: [(0, '30.086')] [2024-08-05 15:16:03,608][15444] Updated weights for policy 0, policy_version 6321 (0.0027) [2024-08-05 15:16:07,271][15444] Updated weights for policy 0, policy_version 6331 (0.0033) [2024-08-05 15:16:08,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23210.7, 300 sec: 23048.7). Total num frames: 51888128. Throughput: 0: 5797.8. Samples: 12966070. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 15:16:08,119][15372] Avg episode reward: [(0, '30.939')] [2024-08-05 15:16:08,124][15417] Saving new best policy, reward=30.939! 
[2024-08-05 15:16:10,570][15444] Updated weights for policy 0, policy_version 6341 (0.0024) [2024-08-05 15:16:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23210.7, 300 sec: 23049.7). Total num frames: 52002816. Throughput: 0: 5798.4. Samples: 13000810. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:16:13,119][15372] Avg episode reward: [(0, '29.668')] [2024-08-05 15:16:14,133][15444] Updated weights for policy 0, policy_version 6351 (0.0015) [2024-08-05 15:16:17,608][15444] Updated weights for policy 0, policy_version 6361 (0.0029) [2024-08-05 15:16:18,119][15372] Fps is (10 sec: 22936.1, 60 sec: 23210.4, 300 sec: 23076.4). Total num frames: 52117504. Throughput: 0: 5797.3. Samples: 13035000. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:16:18,119][15372] Avg episode reward: [(0, '30.122')] [2024-08-05 15:16:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000006362_52117504.pth... [2024-08-05 15:16:18,283][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000005684_46563328.pth [2024-08-05 15:16:19,640][15417] Signal inference workers to stop experience collection... (2150 times) [2024-08-05 15:16:19,640][15417] Signal inference workers to resume experience collection... (2150 times) [2024-08-05 15:16:19,677][15444] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-08-05 15:16:19,683][15444] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-08-05 15:16:21,492][15444] Updated weights for policy 0, policy_version 6371 (0.0019) [2024-08-05 15:16:23,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23210.7, 300 sec: 23104.2). Total num frames: 52232192. Throughput: 0: 5796.2. Samples: 13052640. 
Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 15:16:23,119][15372] Avg episode reward: [(0, '30.079')] [2024-08-05 15:16:24,743][15444] Updated weights for policy 0, policy_version 6381 (0.0043) [2024-08-05 15:16:28,119][15372] Fps is (10 sec: 22119.6, 60 sec: 23074.1, 300 sec: 23076.5). Total num frames: 52338688. Throughput: 0: 5742.9. Samples: 13085130. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 15:16:28,119][15372] Avg episode reward: [(0, '29.332')] [2024-08-05 15:16:28,895][15444] Updated weights for policy 0, policy_version 6391 (0.0020) [2024-08-05 15:16:32,222][15444] Updated weights for policy 0, policy_version 6401 (0.0022) [2024-08-05 15:16:33,119][15372] Fps is (10 sec: 21298.9, 60 sec: 22937.6, 300 sec: 23076.4). Total num frames: 52445184. Throughput: 0: 5726.0. Samples: 13119530. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 15:16:33,119][15372] Avg episode reward: [(0, '30.449')] [2024-08-05 15:16:35,827][15444] Updated weights for policy 0, policy_version 6411 (0.0017) [2024-08-05 15:16:38,119][15372] Fps is (10 sec: 22935.6, 60 sec: 22937.3, 300 sec: 23076.4). Total num frames: 52568064. Throughput: 0: 5737.0. Samples: 13137220. Policy #0 lag: (min: 0.0, avg: 4.5, max: 7.0) [2024-08-05 15:16:38,120][15372] Avg episode reward: [(0, '30.130')] [2024-08-05 15:16:39,777][15444] Updated weights for policy 0, policy_version 6421 (0.0017) [2024-08-05 15:16:42,856][15444] Updated weights for policy 0, policy_version 6431 (0.0019) [2024-08-05 15:16:43,118][15372] Fps is (10 sec: 24576.3, 60 sec: 23074.1, 300 sec: 23104.3). Total num frames: 52690944. Throughput: 0: 5741.3. Samples: 13171810. Policy #0 lag: (min: 1.0, avg: 4.2, max: 9.0) [2024-08-05 15:16:43,119][15372] Avg episode reward: [(0, '29.324')] [2024-08-05 15:16:46,874][15444] Updated weights for policy 0, policy_version 6441 (0.0013) [2024-08-05 15:16:48,119][15372] Fps is (10 sec: 23757.8, 60 sec: 23074.0, 300 sec: 23076.4). Total num frames: 52805632. 
Throughput: 0: 5710.4. Samples: 13205540. Policy #0 lag: (min: 0.0, avg: 3.0, max: 8.0) [2024-08-05 15:16:48,120][15372] Avg episode reward: [(0, '29.455')] [2024-08-05 15:16:49,913][15444] Updated weights for policy 0, policy_version 6451 (0.0024) [2024-08-05 15:16:53,118][15372] Fps is (10 sec: 22118.3, 60 sec: 22937.6, 300 sec: 23076.5). Total num frames: 52912128. Throughput: 0: 5724.7. Samples: 13223680. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 15:16:53,126][15372] Avg episode reward: [(0, '30.853')] [2024-08-05 15:16:53,757][15444] Updated weights for policy 0, policy_version 6461 (0.0022) [2024-08-05 15:16:57,066][15444] Updated weights for policy 0, policy_version 6471 (0.0017) [2024-08-05 15:16:58,119][15372] Fps is (10 sec: 22119.3, 60 sec: 22801.0, 300 sec: 23104.2). Total num frames: 53026816. Throughput: 0: 5707.8. Samples: 13257660. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 15:16:58,119][15372] Avg episode reward: [(0, '31.734')] [2024-08-05 15:16:58,123][15417] Saving new best policy, reward=31.734! [2024-08-05 15:16:58,246][15417] Signal inference workers to stop experience collection... (2200 times) [2024-08-05 15:16:58,246][15417] Signal inference workers to resume experience collection... (2200 times) [2024-08-05 15:16:58,289][15444] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-08-05 15:16:58,297][15444] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-08-05 15:17:00,665][15444] Updated weights for policy 0, policy_version 6481 (0.0017) [2024-08-05 15:17:03,119][15372] Fps is (10 sec: 23756.1, 60 sec: 23074.0, 300 sec: 23104.2). Total num frames: 53149696. Throughput: 0: 5734.3. Samples: 13293040. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 15:17:03,119][15372] Avg episode reward: [(0, '31.086')] [2024-08-05 15:17:04,280][15444] Updated weights for policy 0, policy_version 6491 (0.0016) [2024-08-05 15:17:07,726][15444] Updated weights for policy 0, policy_version 6501 (0.0019) [2024-08-05 15:17:08,119][15372] Fps is (10 sec: 23757.0, 60 sec: 22937.6, 300 sec: 23076.4). Total num frames: 53264384. Throughput: 0: 5716.0. Samples: 13309860. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:17:08,119][15372] Avg episode reward: [(0, '31.026')] [2024-08-05 15:17:11,541][15444] Updated weights for policy 0, policy_version 6511 (0.0022) [2024-08-05 15:17:13,119][15372] Fps is (10 sec: 22117.8, 60 sec: 22800.8, 300 sec: 23076.4). Total num frames: 53370880. Throughput: 0: 5761.9. Samples: 13344420. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 15:17:13,120][15372] Avg episode reward: [(0, '30.346')] [2024-08-05 15:17:14,874][15444] Updated weights for policy 0, policy_version 6521 (0.0040) [2024-08-05 15:17:18,118][15372] Fps is (10 sec: 22937.7, 60 sec: 22937.8, 300 sec: 23104.2). Total num frames: 53493760. Throughput: 0: 5778.5. Samples: 13379560. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 15:17:18,126][15372] Avg episode reward: [(0, '30.253')] [2024-08-05 15:17:18,436][15444] Updated weights for policy 0, policy_version 6531 (0.0025) [2024-08-05 15:17:21,883][15444] Updated weights for policy 0, policy_version 6541 (0.0010) [2024-08-05 15:17:23,118][15372] Fps is (10 sec: 23758.3, 60 sec: 22937.6, 300 sec: 23104.3). Total num frames: 53608448. Throughput: 0: 5780.8. Samples: 13397350. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:17:23,119][15372] Avg episode reward: [(0, '30.087')] [2024-08-05 15:17:25,448][15444] Updated weights for policy 0, policy_version 6551 (0.0023) [2024-08-05 15:17:28,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23210.6, 300 sec: 23104.2). Total num frames: 53731328. 
Throughput: 0: 5784.9. Samples: 13432130. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:17:28,119][15372] Avg episode reward: [(0, '30.925')] [2024-08-05 15:17:28,938][15444] Updated weights for policy 0, policy_version 6561 (0.0025) [2024-08-05 15:17:32,475][15444] Updated weights for policy 0, policy_version 6571 (0.0024) [2024-08-05 15:17:33,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23210.7, 300 sec: 23076.4). Total num frames: 53837824. Throughput: 0: 5783.6. Samples: 13465800. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:17:33,126][15372] Avg episode reward: [(0, '31.059')] [2024-08-05 15:17:36,099][15444] Updated weights for policy 0, policy_version 6581 (0.0027) [2024-08-05 15:17:38,118][15372] Fps is (10 sec: 22118.7, 60 sec: 23074.5, 300 sec: 23076.4). Total num frames: 53952512. Throughput: 0: 5762.2. Samples: 13482980. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:17:38,119][15372] Avg episode reward: [(0, '30.601')] [2024-08-05 15:17:40,052][15444] Updated weights for policy 0, policy_version 6591 (0.0017) [2024-08-05 15:17:43,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23074.1, 300 sec: 23132.3). Total num frames: 54075392. Throughput: 0: 5768.9. Samples: 13517260. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 15:17:43,127][15372] Avg episode reward: [(0, '31.019')] [2024-08-05 15:17:43,130][15444] Updated weights for policy 0, policy_version 6601 (0.0022) [2024-08-05 15:17:46,939][15444] Updated weights for policy 0, policy_version 6611 (0.0020) [2024-08-05 15:17:48,118][15372] Fps is (10 sec: 22937.7, 60 sec: 22937.8, 300 sec: 23076.4). Total num frames: 54181888. Throughput: 0: 5741.8. Samples: 13551420. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:17:48,119][15372] Avg episode reward: [(0, '30.853')] [2024-08-05 15:17:50,364][15444] Updated weights for policy 0, policy_version 6621 (0.0020) [2024-08-05 15:17:53,118][15372] Fps is (10 sec: 21299.4, 60 sec: 22937.6, 300 sec: 23048.7). Total num frames: 54288384. Throughput: 0: 5756.9. Samples: 13568920. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:17:53,126][15372] Avg episode reward: [(0, '31.237')] [2024-08-05 15:17:54,192][15444] Updated weights for policy 0, policy_version 6631 (0.0046) [2024-08-05 15:17:56,265][15417] Signal inference workers to stop experience collection... (2250 times) [2024-08-05 15:17:56,266][15417] Signal inference workers to resume experience collection... (2250 times) [2024-08-05 15:17:56,325][15444] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-08-05 15:17:56,326][15444] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-08-05 15:17:57,778][15444] Updated weights for policy 0, policy_version 6641 (0.0041) [2024-08-05 15:17:58,119][15372] Fps is (10 sec: 22937.3, 60 sec: 23074.1, 300 sec: 23104.2). Total num frames: 54411264. Throughput: 0: 5756.3. Samples: 13603450. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:17:58,119][15372] Avg episode reward: [(0, '30.976')] [2024-08-05 15:18:00,776][15444] Updated weights for policy 0, policy_version 6651 (0.0020) [2024-08-05 15:18:03,119][15372] Fps is (10 sec: 24575.6, 60 sec: 23074.2, 300 sec: 23104.2). Total num frames: 54534144. Throughput: 0: 5761.5. Samples: 13638830. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:18:03,127][15372] Avg episode reward: [(0, '30.593')] [2024-08-05 15:18:04,680][15444] Updated weights for policy 0, policy_version 6661 (0.0030) [2024-08-05 15:18:08,119][15372] Fps is (10 sec: 22937.7, 60 sec: 22937.6, 300 sec: 23076.5). Total num frames: 54640640. Throughput: 0: 5746.7. Samples: 13655950. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 15:18:08,127][15372] Avg episode reward: [(0, '30.552')] [2024-08-05 15:18:08,174][15444] Updated weights for policy 0, policy_version 6671 (0.0027) [2024-08-05 15:18:11,823][15444] Updated weights for policy 0, policy_version 6681 (0.0033) [2024-08-05 15:18:13,118][15372] Fps is (10 sec: 22118.7, 60 sec: 23074.4, 300 sec: 23076.5). Total num frames: 54755328. Throughput: 0: 5734.2. Samples: 13690170. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:18:13,119][15372] Avg episode reward: [(0, '30.587')] [2024-08-05 15:18:15,372][15444] Updated weights for policy 0, policy_version 6691 (0.0011) [2024-08-05 15:18:18,119][15372] Fps is (10 sec: 22937.0, 60 sec: 22937.5, 300 sec: 23048.7). Total num frames: 54870016. Throughput: 0: 5762.8. Samples: 13725130. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 15:18:18,127][15372] Avg episode reward: [(0, '31.443')] [2024-08-05 15:18:18,223][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000006699_54878208.pth... [2024-08-05 15:18:18,383][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000006023_49340416.pth [2024-08-05 15:18:18,957][15444] Updated weights for policy 0, policy_version 6701 (0.0017) [2024-08-05 15:18:22,593][15444] Updated weights for policy 0, policy_version 6711 (0.0018) [2024-08-05 15:18:23,118][15372] Fps is (10 sec: 22937.7, 60 sec: 22937.6, 300 sec: 23076.4). Total num frames: 54984704. Throughput: 0: 5741.6. Samples: 13741350. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 15:18:23,119][15372] Avg episode reward: [(0, '31.002')] [2024-08-05 15:18:26,096][15444] Updated weights for policy 0, policy_version 6721 (0.0017) [2024-08-05 15:18:28,119][15372] Fps is (10 sec: 23757.4, 60 sec: 22937.6, 300 sec: 23104.2). Total num frames: 55107584. 
Throughput: 0: 5733.3. Samples: 13775260. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:18:28,120][15372] Avg episode reward: [(0, '30.900')] [2024-08-05 15:18:29,828][15444] Updated weights for policy 0, policy_version 6731 (0.0019) [2024-08-05 15:18:33,118][15372] Fps is (10 sec: 22937.5, 60 sec: 22937.6, 300 sec: 23048.7). Total num frames: 55214080. Throughput: 0: 5740.2. Samples: 13809730. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 15:18:33,126][15372] Avg episode reward: [(0, '31.315')] [2024-08-05 15:18:33,447][15444] Updated weights for policy 0, policy_version 6741 (0.0032) [2024-08-05 15:18:36,830][15444] Updated weights for policy 0, policy_version 6751 (0.0020) [2024-08-05 15:18:38,119][15372] Fps is (10 sec: 22117.9, 60 sec: 22937.5, 300 sec: 23048.7). Total num frames: 55328768. Throughput: 0: 5748.8. Samples: 13827620. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:18:38,119][15372] Avg episode reward: [(0, '31.505')] [2024-08-05 15:18:40,517][15444] Updated weights for policy 0, policy_version 6761 (0.0012) [2024-08-05 15:18:43,119][15372] Fps is (10 sec: 22937.5, 60 sec: 22801.1, 300 sec: 23048.7). Total num frames: 55443456. Throughput: 0: 5725.8. Samples: 13861110. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:18:43,126][15372] Avg episode reward: [(0, '30.014')] [2024-08-05 15:18:43,922][15444] Updated weights for policy 0, policy_version 6771 (0.0012) [2024-08-05 15:18:47,767][15444] Updated weights for policy 0, policy_version 6781 (0.0012) [2024-08-05 15:18:48,119][15372] Fps is (10 sec: 22937.9, 60 sec: 22937.5, 300 sec: 23048.7). Total num frames: 55558144. Throughput: 0: 5705.1. Samples: 13895560. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:18:48,119][15372] Avg episode reward: [(0, '29.655')] [2024-08-05 15:18:50,067][15417] Signal inference workers to stop experience collection... 
(2300 times) [2024-08-05 15:18:50,068][15417] Signal inference workers to resume experience collection... (2300 times) [2024-08-05 15:18:50,133][15444] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-08-05 15:18:50,134][15444] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-08-05 15:18:51,327][15444] Updated weights for policy 0, policy_version 6791 (0.0023) [2024-08-05 15:18:53,119][15372] Fps is (10 sec: 22937.5, 60 sec: 23074.1, 300 sec: 23048.7). Total num frames: 55672832. Throughput: 0: 5697.8. Samples: 13912350. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 15:18:53,119][15372] Avg episode reward: [(0, '30.714')] [2024-08-05 15:18:54,650][15444] Updated weights for policy 0, policy_version 6801 (0.0030) [2024-08-05 15:18:58,118][15372] Fps is (10 sec: 22938.0, 60 sec: 22937.6, 300 sec: 23048.7). Total num frames: 55787520. Throughput: 0: 5744.9. Samples: 13948690. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 15:18:58,119][15372] Avg episode reward: [(0, '31.370')] [2024-08-05 15:18:58,291][15444] Updated weights for policy 0, policy_version 6811 (0.0043) [2024-08-05 15:19:01,374][15444] Updated weights for policy 0, policy_version 6821 (0.0012) [2024-08-05 15:19:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 22937.7, 300 sec: 23076.5). Total num frames: 55910400. Throughput: 0: 5743.4. Samples: 13983580. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:19:03,126][15372] Avg episode reward: [(0, '30.704')] [2024-08-05 15:19:05,163][15444] Updated weights for policy 0, policy_version 6831 (0.0019) [2024-08-05 15:19:08,119][15372] Fps is (10 sec: 24575.7, 60 sec: 23210.7, 300 sec: 23104.6). Total num frames: 56033280. Throughput: 0: 5773.8. Samples: 14001170. 
Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 15:19:08,126][15372] Avg episode reward: [(0, '30.957')] [2024-08-05 15:19:08,636][15444] Updated weights for policy 0, policy_version 6841 (0.0021) [2024-08-05 15:19:12,241][15444] Updated weights for policy 0, policy_version 6851 (0.0015) [2024-08-05 15:19:13,119][15372] Fps is (10 sec: 22937.2, 60 sec: 23074.1, 300 sec: 23076.4). Total num frames: 56139776. Throughput: 0: 5787.5. Samples: 14035700. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 15:19:13,119][15372] Avg episode reward: [(0, '31.495')] [2024-08-05 15:19:16,056][15444] Updated weights for policy 0, policy_version 6861 (0.0015) [2024-08-05 15:19:18,118][15372] Fps is (10 sec: 22118.6, 60 sec: 23074.3, 300 sec: 23076.4). Total num frames: 56254464. Throughput: 0: 5782.0. Samples: 14069920. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:19:18,119][15372] Avg episode reward: [(0, '31.160')] [2024-08-05 15:19:19,281][15444] Updated weights for policy 0, policy_version 6871 (0.0033) [2024-08-05 15:19:23,118][15372] Fps is (10 sec: 22118.8, 60 sec: 22937.6, 300 sec: 23020.9). Total num frames: 56360960. Throughput: 0: 5774.3. Samples: 14087460. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 15:19:23,126][15372] Avg episode reward: [(0, '31.087')] [2024-08-05 15:19:23,192][15444] Updated weights for policy 0, policy_version 6881 (0.0018) [2024-08-05 15:19:26,505][15444] Updated weights for policy 0, policy_version 6891 (0.0017) [2024-08-05 15:19:28,118][15372] Fps is (10 sec: 22937.6, 60 sec: 22937.6, 300 sec: 23076.5). Total num frames: 56483840. Throughput: 0: 5783.1. Samples: 14121350. Policy #0 lag: (min: 2.0, avg: 3.9, max: 8.0) [2024-08-05 15:19:28,126][15372] Avg episode reward: [(0, '31.648')] [2024-08-05 15:19:30,075][15444] Updated weights for policy 0, policy_version 6901 (0.0032) [2024-08-05 15:19:33,119][15372] Fps is (10 sec: 23756.1, 60 sec: 23074.0, 300 sec: 23048.7). Total num frames: 56598528. 
Throughput: 0: 5797.8. Samples: 14156460. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:19:33,120][15372] Avg episode reward: [(0, '32.104')] [2024-08-05 15:19:33,174][15417] Saving new best policy, reward=32.104! [2024-08-05 15:19:33,734][15444] Updated weights for policy 0, policy_version 6911 (0.0023) [2024-08-05 15:19:34,900][15417] Signal inference workers to stop experience collection... (2350 times) [2024-08-05 15:19:34,901][15417] Signal inference workers to resume experience collection... (2350 times) [2024-08-05 15:19:34,957][15444] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-08-05 15:19:34,957][15444] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-08-05 15:19:37,066][15444] Updated weights for policy 0, policy_version 6921 (0.0027) [2024-08-05 15:19:38,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23210.7, 300 sec: 23076.4). Total num frames: 56721408. Throughput: 0: 5806.9. Samples: 14173660. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:19:38,119][15372] Avg episode reward: [(0, '31.548')] [2024-08-05 15:19:40,631][15444] Updated weights for policy 0, policy_version 6931 (0.0013) [2024-08-05 15:19:43,119][15372] Fps is (10 sec: 23757.3, 60 sec: 23210.7, 300 sec: 23076.4). Total num frames: 56836096. Throughput: 0: 5795.3. Samples: 14209480. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:19:43,119][15372] Avg episode reward: [(0, '30.888')] [2024-08-05 15:19:44,041][15444] Updated weights for policy 0, policy_version 6941 (0.0012) [2024-08-05 15:19:47,591][15444] Updated weights for policy 0, policy_version 6951 (0.0028) [2024-08-05 15:19:48,119][15372] Fps is (10 sec: 22937.8, 60 sec: 23210.7, 300 sec: 23076.4). Total num frames: 56950784. Throughput: 0: 5785.1. Samples: 14243910. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 15:19:48,119][15372] Avg episode reward: [(0, '30.834')] [2024-08-05 15:19:51,143][15444] Updated weights for policy 0, policy_version 6961 (0.0019) [2024-08-05 15:19:53,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23210.7, 300 sec: 23076.5). Total num frames: 57065472. Throughput: 0: 5789.1. Samples: 14261680. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 15:19:53,126][15372] Avg episode reward: [(0, '31.068')] [2024-08-05 15:19:54,553][15444] Updated weights for policy 0, policy_version 6971 (0.0013) [2024-08-05 15:19:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23347.2, 300 sec: 23104.2). Total num frames: 57188352. Throughput: 0: 5808.0. Samples: 14297060. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:19:58,119][15372] Avg episode reward: [(0, '30.813')] [2024-08-05 15:19:58,141][15444] Updated weights for policy 0, policy_version 6981 (0.0039) [2024-08-05 15:20:01,660][15444] Updated weights for policy 0, policy_version 6991 (0.0014) [2024-08-05 15:20:03,121][15372] Fps is (10 sec: 23751.5, 60 sec: 23209.8, 300 sec: 23076.3). Total num frames: 57303040. Throughput: 0: 5810.4. Samples: 14331400. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:20:03,129][15372] Avg episode reward: [(0, '30.995')] [2024-08-05 15:20:05,225][15444] Updated weights for policy 0, policy_version 7001 (0.0024) [2024-08-05 15:20:08,120][15372] Fps is (10 sec: 22934.8, 60 sec: 23073.7, 300 sec: 23076.4). Total num frames: 57417728. Throughput: 0: 5815.0. Samples: 14349140. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 15:20:08,120][15372] Avg episode reward: [(0, '32.054')] [2024-08-05 15:20:08,714][15444] Updated weights for policy 0, policy_version 7011 (0.0014) [2024-08-05 15:20:12,124][15444] Updated weights for policy 0, policy_version 7021 (0.0012) [2024-08-05 15:20:13,118][15372] Fps is (10 sec: 22942.6, 60 sec: 23210.7, 300 sec: 23076.4). Total num frames: 57532416. 
Throughput: 0: 5832.4. Samples: 14383810. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 15:20:13,119][15372] Avg episode reward: [(0, '32.366')] [2024-08-05 15:20:13,119][15417] Saving new best policy, reward=32.366! [2024-08-05 15:20:15,829][15444] Updated weights for policy 0, policy_version 7031 (0.0020) [2024-08-05 15:20:18,118][15372] Fps is (10 sec: 22940.3, 60 sec: 23210.7, 300 sec: 23076.4). Total num frames: 57647104. Throughput: 0: 5835.1. Samples: 14419040. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 15:20:18,119][15372] Avg episode reward: [(0, '32.508')] [2024-08-05 15:20:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000007037_57647104.pth... [2024-08-05 15:20:18,283][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000006362_52117504.pth [2024-08-05 15:20:18,295][15417] Saving new best policy, reward=32.508! [2024-08-05 15:20:19,370][15444] Updated weights for policy 0, policy_version 7041 (0.0011) [2024-08-05 15:20:23,027][15444] Updated weights for policy 0, policy_version 7051 (0.0022) [2024-08-05 15:20:23,119][15372] Fps is (10 sec: 22937.5, 60 sec: 23347.2, 300 sec: 23076.4). Total num frames: 57761792. Throughput: 0: 5802.4. Samples: 14434770. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:20:23,119][15372] Avg episode reward: [(0, '31.342')] [2024-08-05 15:20:26,580][15444] Updated weights for policy 0, policy_version 7061 (0.0017) [2024-08-05 15:20:28,119][15372] Fps is (10 sec: 22936.6, 60 sec: 23210.5, 300 sec: 23076.4). Total num frames: 57876480. Throughput: 0: 5771.3. Samples: 14469190. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:20:28,119][15372] Avg episode reward: [(0, '31.675')] [2024-08-05 15:20:30,064][15444] Updated weights for policy 0, policy_version 7071 (0.0025) [2024-08-05 15:20:33,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23210.8, 300 sec: 23048.7). Total num frames: 57991168. Throughput: 0: 5780.9. Samples: 14504050. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:20:33,119][15372] Avg episode reward: [(0, '30.920')] [2024-08-05 15:20:33,786][15444] Updated weights for policy 0, policy_version 7081 (0.0012) [2024-08-05 15:20:37,049][15444] Updated weights for policy 0, policy_version 7091 (0.0020) [2024-08-05 15:20:38,119][15372] Fps is (10 sec: 22938.5, 60 sec: 23074.2, 300 sec: 23048.7). Total num frames: 58105856. Throughput: 0: 5780.7. Samples: 14521810. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:20:38,119][15372] Avg episode reward: [(0, '31.150')] [2024-08-05 15:20:38,172][15417] Signal inference workers to stop experience collection... (2400 times) [2024-08-05 15:20:38,172][15417] Signal inference workers to resume experience collection... (2400 times) [2024-08-05 15:20:38,224][15444] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-08-05 15:20:38,229][15444] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-08-05 15:20:40,745][15444] Updated weights for policy 0, policy_version 7101 (0.0028) [2024-08-05 15:20:43,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23210.7, 300 sec: 23076.5). Total num frames: 58228736. Throughput: 0: 5773.5. Samples: 14556870. 
Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:20:43,119][15372] Avg episode reward: [(0, '31.963')] [2024-08-05 15:20:44,269][15444] Updated weights for policy 0, policy_version 7111 (0.0027) [2024-08-05 15:20:47,781][15444] Updated weights for policy 0, policy_version 7121 (0.0023) [2024-08-05 15:20:48,124][15372] Fps is (10 sec: 23743.9, 60 sec: 23208.5, 300 sec: 23076.0). Total num frames: 58343424. Throughput: 0: 5777.1. Samples: 14591390. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:20:48,124][15372] Avg episode reward: [(0, '31.448')] [2024-08-05 15:20:51,309][15444] Updated weights for policy 0, policy_version 7131 (0.0020) [2024-08-05 15:20:53,119][15372] Fps is (10 sec: 22118.3, 60 sec: 23074.1, 300 sec: 23020.9). Total num frames: 58449920. Throughput: 0: 5777.5. Samples: 14609120. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:20:53,127][15372] Avg episode reward: [(0, '31.361')] [2024-08-05 15:20:54,616][15444] Updated weights for policy 0, policy_version 7141 (0.0029) [2024-08-05 15:20:58,118][15372] Fps is (10 sec: 22130.6, 60 sec: 22937.6, 300 sec: 23048.7). Total num frames: 58564608. Throughput: 0: 5775.6. Samples: 14643710. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 15:20:58,119][15372] Avg episode reward: [(0, '30.819')] [2024-08-05 15:20:58,367][15444] Updated weights for policy 0, policy_version 7151 (0.0014) [2024-08-05 15:21:02,028][15444] Updated weights for policy 0, policy_version 7161 (0.0018) [2024-08-05 15:21:03,119][15372] Fps is (10 sec: 23754.9, 60 sec: 23074.6, 300 sec: 23048.6). Total num frames: 58687488. Throughput: 0: 5757.4. Samples: 14678130. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 15:21:03,120][15372] Avg episode reward: [(0, '30.909')] [2024-08-05 15:21:05,387][15444] Updated weights for policy 0, policy_version 7171 (0.0034) [2024-08-05 15:21:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 23211.1, 300 sec: 23076.4). Total num frames: 58810368. 
Throughput: 0: 5803.1. Samples: 14695910. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:21:08,119][15372] Avg episode reward: [(0, '31.913')] [2024-08-05 15:21:08,885][15444] Updated weights for policy 0, policy_version 7181 (0.0018) [2024-08-05 15:21:12,520][15444] Updated weights for policy 0, policy_version 7191 (0.0023) [2024-08-05 15:21:13,118][15372] Fps is (10 sec: 23758.9, 60 sec: 23210.7, 300 sec: 23076.5). Total num frames: 58925056. Throughput: 0: 5813.2. Samples: 14730780. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 15:21:13,119][15372] Avg episode reward: [(0, '30.918')] [2024-08-05 15:21:16,088][15444] Updated weights for policy 0, policy_version 7201 (0.0013) [2024-08-05 15:21:18,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23210.7, 300 sec: 23076.4). Total num frames: 59039744. Throughput: 0: 5806.4. Samples: 14765340. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 15:21:18,125][15372] Avg episode reward: [(0, '30.690')] [2024-08-05 15:21:19,553][15444] Updated weights for policy 0, policy_version 7211 (0.0017) [2024-08-05 15:21:23,119][15372] Fps is (10 sec: 22118.3, 60 sec: 23074.1, 300 sec: 23076.4). Total num frames: 59146240. Throughput: 0: 5803.6. Samples: 14782970. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 15:21:23,126][15372] Avg episode reward: [(0, '31.275')] [2024-08-05 15:21:23,158][15444] Updated weights for policy 0, policy_version 7221 (0.0020) [2024-08-05 15:21:26,539][15444] Updated weights for policy 0, policy_version 7231 (0.0013) [2024-08-05 15:21:28,119][15372] Fps is (10 sec: 22936.8, 60 sec: 23210.7, 300 sec: 23132.0). Total num frames: 59269120. Throughput: 0: 5792.4. Samples: 14817530. 
Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 15:21:28,127][15372] Avg episode reward: [(0, '31.114')] [2024-08-05 15:21:30,812][15444] Updated weights for policy 0, policy_version 7241 (0.0015) [2024-08-05 15:21:33,119][15372] Fps is (10 sec: 21298.7, 60 sec: 22801.0, 300 sec: 23021.0). Total num frames: 59359232. Throughput: 0: 5667.3. Samples: 14846390. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:21:33,127][15372] Avg episode reward: [(0, '31.693')] [2024-08-05 15:21:35,079][15444] Updated weights for policy 0, policy_version 7251 (0.0026) [2024-08-05 15:21:38,119][15372] Fps is (10 sec: 18842.2, 60 sec: 22528.0, 300 sec: 22937.6). Total num frames: 59457536. Throughput: 0: 5592.9. Samples: 14860800. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:21:38,126][15372] Avg episode reward: [(0, '31.849')] [2024-08-05 15:21:38,860][15444] Updated weights for policy 0, policy_version 7261 (0.0023) [2024-08-05 15:21:42,546][15444] Updated weights for policy 0, policy_version 7271 (0.0015) [2024-08-05 15:21:43,118][15372] Fps is (10 sec: 22119.0, 60 sec: 22528.0, 300 sec: 22965.4). Total num frames: 59580416. Throughput: 0: 5575.5. Samples: 14894610. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 15:21:43,119][15372] Avg episode reward: [(0, '31.856')] [2024-08-05 15:21:43,208][15417] Signal inference workers to stop experience collection... (2450 times) [2024-08-05 15:21:43,209][15417] Signal inference workers to resume experience collection... (2450 times) [2024-08-05 15:21:43,261][15444] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-08-05 15:21:43,261][15444] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-08-05 15:21:45,557][15444] Updated weights for policy 0, policy_version 7281 (0.0020) [2024-08-05 15:21:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 22666.6, 300 sec: 23020.9). Total num frames: 59703296. Throughput: 0: 5616.8. Samples: 14930880. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:21:48,119][15372] Avg episode reward: [(0, '32.571')] [2024-08-05 15:21:48,122][15417] Saving new best policy, reward=32.571! [2024-08-05 15:21:49,355][15444] Updated weights for policy 0, policy_version 7291 (0.0017) [2024-08-05 15:21:52,595][15444] Updated weights for policy 0, policy_version 7301 (0.0012) [2024-08-05 15:21:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 22801.1, 300 sec: 23020.9). Total num frames: 59817984. Throughput: 0: 5619.8. Samples: 14948800. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:21:53,119][15372] Avg episode reward: [(0, '32.486')] [2024-08-05 15:21:55,976][15444] Updated weights for policy 0, policy_version 7311 (0.0021) [2024-08-05 15:21:58,119][15372] Fps is (10 sec: 23756.8, 60 sec: 22937.6, 300 sec: 23020.9). Total num frames: 59940864. Throughput: 0: 5644.2. Samples: 14984770. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:21:58,127][15372] Avg episode reward: [(0, '31.533')] [2024-08-05 15:21:59,477][15444] Updated weights for policy 0, policy_version 7321 (0.0011) [2024-08-05 15:22:02,961][15444] Updated weights for policy 0, policy_version 7331 (0.0034) [2024-08-05 15:22:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 22801.4, 300 sec: 23020.9). Total num frames: 60055552. Throughput: 0: 5671.8. Samples: 15020570. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:22:03,119][15372] Avg episode reward: [(0, '31.528')] [2024-08-05 15:22:06,225][15444] Updated weights for policy 0, policy_version 7341 (0.0012) [2024-08-05 15:22:08,119][15372] Fps is (10 sec: 23756.4, 60 sec: 22801.0, 300 sec: 23076.5). Total num frames: 60178432. Throughput: 0: 5700.6. Samples: 15039500. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:22:08,119][15372] Avg episode reward: [(0, '31.958')] [2024-08-05 15:22:09,722][15444] Updated weights for policy 0, policy_version 7351 (0.0018) [2024-08-05 15:22:12,975][15444] Updated weights for policy 0, policy_version 7361 (0.0011) [2024-08-05 15:22:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 22937.6, 300 sec: 23076.5). Total num frames: 60301312. Throughput: 0: 5753.4. Samples: 15076430. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:22:13,119][15372] Avg episode reward: [(0, '31.737')] [2024-08-05 15:22:16,139][15444] Updated weights for policy 0, policy_version 7371 (0.0016) [2024-08-05 15:22:18,119][15372] Fps is (10 sec: 24576.2, 60 sec: 23074.1, 300 sec: 23104.2). Total num frames: 60424192. Throughput: 0: 5906.9. Samples: 15112200. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:22:18,126][15372] Avg episode reward: [(0, '32.391')] [2024-08-05 15:22:18,238][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000007377_60432384.pth... [2024-08-05 15:22:18,390][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000006699_54878208.pth [2024-08-05 15:22:19,666][15444] Updated weights for policy 0, policy_version 7381 (0.0020) [2024-08-05 15:22:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23210.7, 300 sec: 23076.5). Total num frames: 60538880. Throughput: 0: 5986.9. Samples: 15130210. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:22:23,126][15372] Avg episode reward: [(0, '31.625')] [2024-08-05 15:22:23,265][15444] Updated weights for policy 0, policy_version 7391 (0.0016) [2024-08-05 15:22:26,464][15444] Updated weights for policy 0, policy_version 7401 (0.0020) [2024-08-05 15:22:28,118][15372] Fps is (10 sec: 23757.1, 60 sec: 23210.8, 300 sec: 23132.0). Total num frames: 60661760. 
Throughput: 0: 6029.1. Samples: 15165920. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 15:22:28,119][15372] Avg episode reward: [(0, '31.910')] [2024-08-05 15:22:30,000][15444] Updated weights for policy 0, policy_version 7411 (0.0026) [2024-08-05 15:22:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 23756.9, 300 sec: 23159.8). Total num frames: 60784640. Throughput: 0: 6033.1. Samples: 15202370. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 15:22:33,126][15372] Avg episode reward: [(0, '32.402')] [2024-08-05 15:22:33,343][15444] Updated weights for policy 0, policy_version 7421 (0.0034) [2024-08-05 15:22:36,601][15444] Updated weights for policy 0, policy_version 7431 (0.0029) [2024-08-05 15:22:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 23159.8). Total num frames: 60907520. Throughput: 0: 6046.4. Samples: 15220890. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:22:38,126][15372] Avg episode reward: [(0, '31.473')] [2024-08-05 15:22:40,066][15444] Updated weights for policy 0, policy_version 7441 (0.0033) [2024-08-05 15:22:40,827][15417] Signal inference workers to stop experience collection... (2500 times) [2024-08-05 15:22:40,827][15417] Signal inference workers to resume experience collection... (2500 times) [2024-08-05 15:22:40,874][15444] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-08-05 15:22:40,875][15444] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-08-05 15:22:43,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24166.2, 300 sec: 23215.3). Total num frames: 61030400. Throughput: 0: 6064.6. Samples: 15257680. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:22:43,119][15372] Avg episode reward: [(0, '30.558')] [2024-08-05 15:22:43,353][15444] Updated weights for policy 0, policy_version 7451 (0.0017) [2024-08-05 15:22:46,810][15444] Updated weights for policy 0, policy_version 7461 (0.0026) [2024-08-05 15:22:48,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 23243.1). Total num frames: 61145088. Throughput: 0: 6082.7. Samples: 15294290. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:22:48,119][15372] Avg episode reward: [(0, '31.764')] [2024-08-05 15:22:50,163][15444] Updated weights for policy 0, policy_version 7471 (0.0017) [2024-08-05 15:22:53,118][15372] Fps is (10 sec: 24577.3, 60 sec: 24302.9, 300 sec: 23270.8). Total num frames: 61276160. Throughput: 0: 6069.6. Samples: 15312630. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 15:22:53,119][15372] Avg episode reward: [(0, '31.574')] [2024-08-05 15:22:53,246][15444] Updated weights for policy 0, policy_version 7481 (0.0025) [2024-08-05 15:22:56,995][15444] Updated weights for policy 0, policy_version 7491 (0.0026) [2024-08-05 15:22:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 23243.1). Total num frames: 61390848. Throughput: 0: 6049.5. Samples: 15348660. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:22:58,119][15372] Avg episode reward: [(0, '32.745')] [2024-08-05 15:22:58,122][15417] Saving new best policy, reward=32.745! [2024-08-05 15:23:00,208][15444] Updated weights for policy 0, policy_version 7501 (0.0011) [2024-08-05 15:23:03,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 23298.6). Total num frames: 61513728. Throughput: 0: 6064.5. Samples: 15385100. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:23:03,126][15372] Avg episode reward: [(0, '31.632')] [2024-08-05 15:23:03,595][15444] Updated weights for policy 0, policy_version 7511 (0.0011) [2024-08-05 15:23:07,254][15444] Updated weights for policy 0, policy_version 7521 (0.0034) [2024-08-05 15:23:08,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24303.0, 300 sec: 23326.4). Total num frames: 61636608. Throughput: 0: 6055.5. Samples: 15402710. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 15:23:08,119][15372] Avg episode reward: [(0, '30.014')] [2024-08-05 15:23:10,335][15444] Updated weights for policy 0, policy_version 7531 (0.0013) [2024-08-05 15:23:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 23326.4). Total num frames: 61751296. Throughput: 0: 6059.3. Samples: 15438590. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:23:13,126][15372] Avg episode reward: [(0, '30.381')] [2024-08-05 15:23:13,950][15444] Updated weights for policy 0, policy_version 7541 (0.0017) [2024-08-05 15:23:17,225][15444] Updated weights for policy 0, policy_version 7551 (0.0012) [2024-08-05 15:23:18,118][15372] Fps is (10 sec: 22937.7, 60 sec: 24029.9, 300 sec: 23326.4). Total num frames: 61865984. Throughput: 0: 6036.9. Samples: 15474030. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:23:18,126][15372] Avg episode reward: [(0, '30.922')] [2024-08-05 15:23:20,831][15444] Updated weights for policy 0, policy_version 7561 (0.0018) [2024-08-05 15:23:23,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24166.2, 300 sec: 23326.3). Total num frames: 61988864. Throughput: 0: 6031.3. Samples: 15492300. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:23:23,127][15372] Avg episode reward: [(0, '30.605')] [2024-08-05 15:23:24,367][15444] Updated weights for policy 0, policy_version 7571 (0.0021) [2024-08-05 15:23:27,646][15444] Updated weights for policy 0, policy_version 7581 (0.0013) [2024-08-05 15:23:28,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 23381.9). Total num frames: 62111744. Throughput: 0: 6014.9. Samples: 15528350. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 15:23:28,119][15372] Avg episode reward: [(0, '31.292')] [2024-08-05 15:23:31,214][15444] Updated weights for policy 0, policy_version 7591 (0.0023) [2024-08-05 15:23:31,617][15417] Signal inference workers to stop experience collection... (2550 times) [2024-08-05 15:23:31,618][15417] Signal inference workers to resume experience collection... (2550 times) [2024-08-05 15:23:31,667][15444] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-08-05 15:23:31,667][15444] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-08-05 15:23:33,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24166.4, 300 sec: 23409.7). Total num frames: 62234624. Throughput: 0: 5994.2. Samples: 15564030. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:23:33,119][15372] Avg episode reward: [(0, '32.683')] [2024-08-05 15:23:34,486][15444] Updated weights for policy 0, policy_version 7601 (0.0029) [2024-08-05 15:23:37,786][15444] Updated weights for policy 0, policy_version 7611 (0.0016) [2024-08-05 15:23:38,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 23437.4). Total num frames: 62357504. Throughput: 0: 6001.3. Samples: 15582690. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 15:23:38,119][15372] Avg episode reward: [(0, '31.588')] [2024-08-05 15:23:41,182][15444] Updated weights for policy 0, policy_version 7621 (0.0021) [2024-08-05 15:23:43,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24029.9, 300 sec: 23437.4). Total num frames: 62472192. Throughput: 0: 5993.3. Samples: 15618360. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 15:23:43,127][15372] Avg episode reward: [(0, '30.503')] [2024-08-05 15:23:44,549][15444] Updated weights for policy 0, policy_version 7631 (0.0029) [2024-08-05 15:23:47,909][15444] Updated weights for policy 0, policy_version 7641 (0.0022) [2024-08-05 15:23:48,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 23465.2). Total num frames: 62595072. Throughput: 0: 5987.3. Samples: 15654530. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:23:48,119][15372] Avg episode reward: [(0, '30.343')] [2024-08-05 15:23:51,530][15444] Updated weights for policy 0, policy_version 7651 (0.0012) [2024-08-05 15:23:53,119][15372] Fps is (10 sec: 24576.8, 60 sec: 24029.8, 300 sec: 23493.0). Total num frames: 62717952. Throughput: 0: 6012.9. Samples: 15673290. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 15:23:53,119][15372] Avg episode reward: [(0, '31.829')] [2024-08-05 15:23:54,802][15444] Updated weights for policy 0, policy_version 7661 (0.0012) [2024-08-05 15:23:58,099][15444] Updated weights for policy 0, policy_version 7671 (0.0011) [2024-08-05 15:23:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 23493.0). Total num frames: 62840832. Throughput: 0: 6024.7. Samples: 15709700. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:23:58,119][15372] Avg episode reward: [(0, '32.225')] [2024-08-05 15:24:01,503][15444] Updated weights for policy 0, policy_version 7681 (0.0024) [2024-08-05 15:24:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 23465.2). Total num frames: 62955520. 
Throughput: 0: 6028.2. Samples: 15745300. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:24:03,126][15372] Avg episode reward: [(0, '31.630')] [2024-08-05 15:24:04,706][15444] Updated weights for policy 0, policy_version 7691 (0.0015) [2024-08-05 15:24:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 23520.8). Total num frames: 63078400. Throughput: 0: 6046.3. Samples: 15764380. Policy #0 lag: (min: 0.0, avg: 3.3, max: 9.0) [2024-08-05 15:24:08,126][15372] Avg episode reward: [(0, '31.873')] [2024-08-05 15:24:08,457][15444] Updated weights for policy 0, policy_version 7701 (0.0011) [2024-08-05 15:24:11,776][15444] Updated weights for policy 0, policy_version 7711 (0.0011) [2024-08-05 15:24:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 23548.5). Total num frames: 63201280. Throughput: 0: 6046.5. Samples: 15800440. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:24:13,119][15372] Avg episode reward: [(0, '31.570')] [2024-08-05 15:24:14,961][15444] Updated weights for policy 0, policy_version 7721 (0.0018) [2024-08-05 15:24:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 23576.3). Total num frames: 63315968. Throughput: 0: 6047.1. Samples: 15836150. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 15:24:18,126][15372] Avg episode reward: [(0, '31.654')] [2024-08-05 15:24:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000007729_63315968.pth... [2024-08-05 15:24:18,280][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000007037_57647104.pth [2024-08-05 15:24:18,601][15444] Updated weights for policy 0, policy_version 7731 (0.0012) [2024-08-05 15:24:21,525][15417] Signal inference workers to stop experience collection... 
(2600 times) [2024-08-05 15:24:21,531][15417] Signal inference workers to resume experience collection... (2600 times) [2024-08-05 15:24:21,616][15444] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-08-05 15:24:21,617][15444] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-08-05 15:24:21,864][15444] Updated weights for policy 0, policy_version 7741 (0.0034) [2024-08-05 15:24:23,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24030.0, 300 sec: 23548.5). Total num frames: 63430656. Throughput: 0: 6029.8. Samples: 15854030. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 15:24:23,119][15372] Avg episode reward: [(0, '31.871')] [2024-08-05 15:24:25,398][15444] Updated weights for policy 0, policy_version 7751 (0.0014) [2024-08-05 15:24:28,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.5, 300 sec: 23604.1). Total num frames: 63561728. Throughput: 0: 6022.9. Samples: 15889390. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 15:24:28,126][15372] Avg episode reward: [(0, '31.349')] [2024-08-05 15:24:28,913][15444] Updated weights for policy 0, policy_version 7761 (0.0017) [2024-08-05 15:24:32,449][15444] Updated weights for policy 0, policy_version 7771 (0.0017) [2024-08-05 15:24:33,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24029.7, 300 sec: 23576.3). Total num frames: 63676416. Throughput: 0: 6009.5. Samples: 15924960. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:24:33,119][15372] Avg episode reward: [(0, '30.876')] [2024-08-05 15:24:35,834][15444] Updated weights for policy 0, policy_version 7781 (0.0022) [2024-08-05 15:24:38,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23893.4, 300 sec: 23576.3). Total num frames: 63791104. Throughput: 0: 5992.2. Samples: 15942940. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:24:38,126][15372] Avg episode reward: [(0, '30.900')] [2024-08-05 15:24:39,259][15444] Updated weights for policy 0, policy_version 7791 (0.0018) [2024-08-05 15:24:42,566][15444] Updated weights for policy 0, policy_version 7801 (0.0028) [2024-08-05 15:24:43,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24030.0, 300 sec: 23604.1). Total num frames: 63913984. Throughput: 0: 5985.8. Samples: 15979060. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:24:43,119][15372] Avg episode reward: [(0, '32.557')] [2024-08-05 15:24:46,173][15444] Updated weights for policy 0, policy_version 7811 (0.0011) [2024-08-05 15:24:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.3, 300 sec: 23604.1). Total num frames: 64028672. Throughput: 0: 5981.3. Samples: 16014460. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:24:48,119][15372] Avg episode reward: [(0, '32.186')] [2024-08-05 15:24:49,372][15444] Updated weights for policy 0, policy_version 7821 (0.0015) [2024-08-05 15:24:52,921][15444] Updated weights for policy 0, policy_version 7831 (0.0030) [2024-08-05 15:24:53,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.9, 300 sec: 23631.8). Total num frames: 64159744. Throughput: 0: 5973.3. Samples: 16033180. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:24:53,119][15372] Avg episode reward: [(0, '31.967')] [2024-08-05 15:24:56,383][15444] Updated weights for policy 0, policy_version 7841 (0.0025) [2024-08-05 15:24:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 23893.3, 300 sec: 23632.0). Total num frames: 64274432. Throughput: 0: 5962.4. Samples: 16068750. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:24:58,126][15372] Avg episode reward: [(0, '31.511')] [2024-08-05 15:24:59,644][15444] Updated weights for policy 0, policy_version 7851 (0.0017) [2024-08-05 15:25:03,051][15444] Updated weights for policy 0, policy_version 7861 (0.0019) [2024-08-05 15:25:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 23659.7). Total num frames: 64397312. Throughput: 0: 5976.9. Samples: 16105110. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:25:03,119][15372] Avg episode reward: [(0, '30.694')] [2024-08-05 15:25:06,251][15444] Updated weights for policy 0, policy_version 7871 (0.0012) [2024-08-05 15:25:08,119][15372] Fps is (10 sec: 23755.6, 60 sec: 23893.1, 300 sec: 23659.6). Total num frames: 64512000. Throughput: 0: 5992.8. Samples: 16123710. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 15:25:08,127][15372] Avg episode reward: [(0, '31.614')] [2024-08-05 15:25:09,702][15444] Updated weights for policy 0, policy_version 7881 (0.0014) [2024-08-05 15:25:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.3, 300 sec: 23687.4). Total num frames: 64634880. Throughput: 0: 6015.8. Samples: 16160100. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:25:13,126][15372] Avg episode reward: [(0, '33.084')] [2024-08-05 15:25:13,127][15417] Saving new best policy, reward=33.084! [2024-08-05 15:25:13,403][15444] Updated weights for policy 0, policy_version 7891 (0.0017) [2024-08-05 15:25:16,593][15444] Updated weights for policy 0, policy_version 7901 (0.0023) [2024-08-05 15:25:18,118][15372] Fps is (10 sec: 24577.3, 60 sec: 24029.9, 300 sec: 23715.2). Total num frames: 64757760. Throughput: 0: 6011.2. Samples: 16195460. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 15:25:18,126][15372] Avg episode reward: [(0, '32.897')] [2024-08-05 15:25:19,938][15444] Updated weights for policy 0, policy_version 7911 (0.0013) [2024-08-05 15:25:23,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 23742.9). Total num frames: 64880640. Throughput: 0: 6025.3. Samples: 16214080. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 15:25:23,126][15372] Avg episode reward: [(0, '33.099')] [2024-08-05 15:25:23,127][15417] Saving new best policy, reward=33.099! [2024-08-05 15:25:23,453][15444] Updated weights for policy 0, policy_version 7921 (0.0025) [2024-08-05 15:25:25,636][15417] Signal inference workers to stop experience collection... (2650 times) [2024-08-05 15:25:25,637][15417] Signal inference workers to resume experience collection... (2650 times) [2024-08-05 15:25:25,702][15444] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-08-05 15:25:25,710][15444] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-08-05 15:25:26,852][15444] Updated weights for policy 0, policy_version 7931 (0.0013) [2024-08-05 15:25:28,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24029.8, 300 sec: 23770.7). Total num frames: 65003520. Throughput: 0: 6013.3. Samples: 16249660. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:25:28,119][15372] Avg episode reward: [(0, '32.609')] [2024-08-05 15:25:30,021][15444] Updated weights for policy 0, policy_version 7941 (0.0026) [2024-08-05 15:25:33,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24030.0, 300 sec: 23770.7). Total num frames: 65118208. Throughput: 0: 6039.5. Samples: 16286240. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:25:33,126][15372] Avg episode reward: [(0, '32.461')] [2024-08-05 15:25:33,642][15444] Updated weights for policy 0, policy_version 7951 (0.0030) [2024-08-05 15:25:37,056][15444] Updated weights for policy 0, policy_version 7961 (0.0015) [2024-08-05 15:25:38,119][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 23770.7). Total num frames: 65241088. Throughput: 0: 6036.9. Samples: 16304840. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:25:38,119][15372] Avg episode reward: [(0, '31.898')] [2024-08-05 15:25:40,335][15444] Updated weights for policy 0, policy_version 7971 (0.0019) [2024-08-05 15:25:43,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 23771.1). Total num frames: 65355776. Throughput: 0: 6029.3. Samples: 16340070. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:25:43,119][15372] Avg episode reward: [(0, '30.828')] [2024-08-05 15:25:43,865][15444] Updated weights for policy 0, policy_version 7981 (0.0013) [2024-08-05 15:25:47,276][15444] Updated weights for policy 0, policy_version 7991 (0.0020) [2024-08-05 15:25:48,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 23826.2). Total num frames: 65478656. Throughput: 0: 6023.1. Samples: 16376150. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:25:48,119][15372] Avg episode reward: [(0, '31.985')] [2024-08-05 15:25:50,633][15444] Updated weights for policy 0, policy_version 8001 (0.0017) [2024-08-05 15:25:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 23854.0). Total num frames: 65601536. Throughput: 0: 6005.0. Samples: 16393930. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 15:25:53,119][15372] Avg episode reward: [(0, '32.637')] [2024-08-05 15:25:54,224][15444] Updated weights for policy 0, policy_version 8011 (0.0018) [2024-08-05 15:25:57,241][15444] Updated weights for policy 0, policy_version 8021 (0.0011) [2024-08-05 15:25:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 23826.3). Total num frames: 65716224. Throughput: 0: 6016.4. Samples: 16430840. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 15:25:58,126][15372] Avg episode reward: [(0, '32.445')] [2024-08-05 15:26:00,813][15444] Updated weights for policy 0, policy_version 8031 (0.0023) [2024-08-05 15:26:03,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.8, 300 sec: 23826.2). Total num frames: 65839104. Throughput: 0: 6034.4. Samples: 16467010. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:26:03,119][15372] Avg episode reward: [(0, '33.405')] [2024-08-05 15:26:03,120][15417] Saving new best policy, reward=33.405! [2024-08-05 15:26:04,262][15444] Updated weights for policy 0, policy_version 8041 (0.0017) [2024-08-05 15:26:07,716][15444] Updated weights for policy 0, policy_version 8051 (0.0021) [2024-08-05 15:26:08,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.6, 300 sec: 23854.0). Total num frames: 65961984. Throughput: 0: 6013.3. Samples: 16484680. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:26:08,119][15372] Avg episode reward: [(0, '33.196')] [2024-08-05 15:26:10,830][15444] Updated weights for policy 0, policy_version 8061 (0.0010) [2024-08-05 15:26:12,612][15417] Signal inference workers to stop experience collection... (2700 times) [2024-08-05 15:26:12,623][15417] Signal inference workers to resume experience collection... 
(2700 times) [2024-08-05 15:26:12,656][15444] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-08-05 15:26:12,656][15444] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-08-05 15:26:13,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 23881.8). Total num frames: 66084864. Throughput: 0: 6028.3. Samples: 16520930. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 15:26:13,119][15372] Avg episode reward: [(0, '32.477')] [2024-08-05 15:26:14,509][15444] Updated weights for policy 0, policy_version 8071 (0.0011) [2024-08-05 15:26:17,498][15444] Updated weights for policy 0, policy_version 8081 (0.0018) [2024-08-05 15:26:18,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 23909.5). Total num frames: 66199552. Throughput: 0: 6018.4. Samples: 16557070. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 15:26:18,119][15372] Avg episode reward: [(0, '32.430')] [2024-08-05 15:26:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000008081_66199552.pth... [2024-08-05 15:26:18,268][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000007377_60432384.pth [2024-08-05 15:26:21,107][15444] Updated weights for policy 0, policy_version 8091 (0.0021) [2024-08-05 15:26:23,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24166.3, 300 sec: 23937.3). Total num frames: 66330624. Throughput: 0: 6021.1. Samples: 16575790. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 15:26:23,119][15372] Avg episode reward: [(0, '32.387')] [2024-08-05 15:26:24,626][15444] Updated weights for policy 0, policy_version 8101 (0.0026) [2024-08-05 15:26:27,795][15444] Updated weights for policy 0, policy_version 8111 (0.0033) [2024-08-05 15:26:28,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24030.0, 300 sec: 24020.6). Total num frames: 66445312. 
Throughput: 0: 6040.9. Samples: 16611910. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 15:26:28,119][15372] Avg episode reward: [(0, '32.892')] [2024-08-05 15:26:31,679][15444] Updated weights for policy 0, policy_version 8121 (0.0010) [2024-08-05 15:26:33,118][15372] Fps is (10 sec: 22938.3, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 66560000. Throughput: 0: 6030.7. Samples: 16647530. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 15:26:33,119][15372] Avg episode reward: [(0, '33.186')] [2024-08-05 15:26:34,683][15444] Updated weights for policy 0, policy_version 8131 (0.0018) [2024-08-05 15:26:38,121][15372] Fps is (10 sec: 23750.2, 60 sec: 24028.8, 300 sec: 24075.9). Total num frames: 66682880. Throughput: 0: 6046.1. Samples: 16666020. Policy #0 lag: (min: 0.0, avg: 4.4, max: 7.0) [2024-08-05 15:26:38,129][15372] Avg episode reward: [(0, '33.000')] [2024-08-05 15:26:38,277][15444] Updated weights for policy 0, policy_version 8141 (0.0011) [2024-08-05 15:26:41,903][15444] Updated weights for policy 0, policy_version 8151 (0.0014) [2024-08-05 15:26:43,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24303.0, 300 sec: 24103.9). Total num frames: 66813952. Throughput: 0: 6032.9. Samples: 16702320. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 15:26:43,119][15372] Avg episode reward: [(0, '32.752')] [2024-08-05 15:26:44,905][15444] Updated weights for policy 0, policy_version 8161 (0.0013) [2024-08-05 15:26:48,119][15372] Fps is (10 sec: 23762.7, 60 sec: 24029.8, 300 sec: 24076.1). Total num frames: 66920448. Throughput: 0: 6023.3. Samples: 16738060. Policy #0 lag: (min: 1.0, avg: 4.8, max: 8.0) [2024-08-05 15:26:48,127][15372] Avg episode reward: [(0, '32.588')] [2024-08-05 15:26:48,586][15444] Updated weights for policy 0, policy_version 8171 (0.0012) [2024-08-05 15:26:48,692][15417] Signal inference workers to stop experience collection... 
(2750 times) [2024-08-05 15:26:48,693][15417] Signal inference workers to resume experience collection... (2750 times) [2024-08-05 15:26:48,721][15444] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-08-05 15:26:48,721][15444] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-08-05 15:26:51,497][15444] Updated weights for policy 0, policy_version 8181 (0.0020) [2024-08-05 15:26:53,119][15372] Fps is (10 sec: 22937.3, 60 sec: 24029.8, 300 sec: 24076.1). Total num frames: 67043328. Throughput: 0: 6050.4. Samples: 16756950. Policy #0 lag: (min: 1.0, avg: 4.8, max: 8.0) [2024-08-05 15:26:53,127][15372] Avg episode reward: [(0, '33.237')] [2024-08-05 15:26:55,071][15444] Updated weights for policy 0, policy_version 8191 (0.0014) [2024-08-05 15:26:58,118][15372] Fps is (10 sec: 25395.9, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 67174400. Throughput: 0: 6050.4. Samples: 16793200. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 15:26:58,126][15372] Avg episode reward: [(0, '33.721')] [2024-08-05 15:26:58,129][15417] Saving new best policy, reward=33.721! [2024-08-05 15:26:58,800][15444] Updated weights for policy 0, policy_version 8201 (0.0013) [2024-08-05 15:27:01,911][15444] Updated weights for policy 0, policy_version 8211 (0.0012) [2024-08-05 15:27:03,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 67289088. Throughput: 0: 6034.0. Samples: 16828600. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:27:03,119][15372] Avg episode reward: [(0, '33.658')] [2024-08-05 15:27:05,219][15444] Updated weights for policy 0, policy_version 8221 (0.0012) [2024-08-05 15:27:08,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 67403776. Throughput: 0: 6018.5. Samples: 16846620. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:27:08,126][15372] Avg episode reward: [(0, '32.826')] [2024-08-05 15:27:08,740][15444] Updated weights for policy 0, policy_version 8231 (0.0018) [2024-08-05 15:27:12,346][15444] Updated weights for policy 0, policy_version 8241 (0.0011) [2024-08-05 15:27:13,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24029.7, 300 sec: 24076.1). Total num frames: 67526656. Throughput: 0: 6011.3. Samples: 16882420. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:27:13,120][15372] Avg episode reward: [(0, '32.728')] [2024-08-05 15:27:15,560][15444] Updated weights for policy 0, policy_version 8251 (0.0014) [2024-08-05 15:27:18,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 67649536. Throughput: 0: 6020.6. Samples: 16918460. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:27:18,126][15372] Avg episode reward: [(0, '33.366')] [2024-08-05 15:27:19,042][15444] Updated weights for policy 0, policy_version 8261 (0.0020) [2024-08-05 15:27:22,440][15444] Updated weights for policy 0, policy_version 8271 (0.0012) [2024-08-05 15:27:23,118][15372] Fps is (10 sec: 23757.7, 60 sec: 23893.4, 300 sec: 24076.1). Total num frames: 67764224. Throughput: 0: 6014.1. Samples: 16936640. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:27:23,126][15372] Avg episode reward: [(0, '32.584')] [2024-08-05 15:27:25,830][15444] Updated weights for policy 0, policy_version 8281 (0.0011) [2024-08-05 15:27:28,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 67887104. Throughput: 0: 6001.8. Samples: 16972400. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:27:28,126][15372] Avg episode reward: [(0, '32.652')] [2024-08-05 15:27:29,297][15444] Updated weights for policy 0, policy_version 8291 (0.0032) [2024-08-05 15:27:32,790][15444] Updated weights for policy 0, policy_version 8301 (0.0029) [2024-08-05 15:27:33,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24166.2, 300 sec: 24076.1). Total num frames: 68009984. Throughput: 0: 6012.6. Samples: 17008630. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 15:27:33,119][15372] Avg episode reward: [(0, '32.764')] [2024-08-05 15:27:35,908][15444] Updated weights for policy 0, policy_version 8311 (0.0014) [2024-08-05 15:27:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24167.5, 300 sec: 24076.2). Total num frames: 68132864. Throughput: 0: 5992.5. Samples: 17026610. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:27:38,126][15372] Avg episode reward: [(0, '32.690')] [2024-08-05 15:27:39,470][15444] Updated weights for policy 0, policy_version 8321 (0.0013) [2024-08-05 15:27:42,860][15444] Updated weights for policy 0, policy_version 8331 (0.0012) [2024-08-05 15:27:43,119][15372] Fps is (10 sec: 23757.5, 60 sec: 23893.3, 300 sec: 24076.1). Total num frames: 68247552. Throughput: 0: 5989.8. Samples: 17062740. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:27:43,119][15372] Avg episode reward: [(0, '32.707')] [2024-08-05 15:27:43,726][15417] Signal inference workers to stop experience collection... (2800 times) [2024-08-05 15:27:43,727][15417] Signal inference workers to resume experience collection... 
(2800 times) [2024-08-05 15:27:43,772][15444] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-08-05 15:27:43,772][15444] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-08-05 15:27:46,343][15444] Updated weights for policy 0, policy_version 8341 (0.0030) [2024-08-05 15:27:48,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.4, 300 sec: 24048.4). Total num frames: 68370432. Throughput: 0: 6015.3. Samples: 17099290. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:27:48,119][15372] Avg episode reward: [(0, '33.021')] [2024-08-05 15:27:49,739][15444] Updated weights for policy 0, policy_version 8351 (0.0011) [2024-08-05 15:27:52,814][15444] Updated weights for policy 0, policy_version 8361 (0.0016) [2024-08-05 15:27:53,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24076.2). Total num frames: 68493312. Throughput: 0: 6022.4. Samples: 17117630. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 15:27:53,119][15372] Avg episode reward: [(0, '33.312')] [2024-08-05 15:27:56,130][15444] Updated weights for policy 0, policy_version 8371 (0.0020) [2024-08-05 15:27:58,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 68616192. Throughput: 0: 6042.1. Samples: 17154310. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:27:58,126][15372] Avg episode reward: [(0, '32.409')] [2024-08-05 15:27:59,591][15444] Updated weights for policy 0, policy_version 8381 (0.0012) [2024-08-05 15:28:02,921][15444] Updated weights for policy 0, policy_version 8391 (0.0011) [2024-08-05 15:28:03,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.3, 300 sec: 24076.1). Total num frames: 68739072. Throughput: 0: 6050.4. Samples: 17190730. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:28:03,119][15372] Avg episode reward: [(0, '31.904')] [2024-08-05 15:28:06,398][15444] Updated weights for policy 0, policy_version 8401 (0.0018) [2024-08-05 15:28:08,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24076.1). Total num frames: 68853760. Throughput: 0: 6060.9. Samples: 17209380. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 15:28:08,126][15372] Avg episode reward: [(0, '32.837')] [2024-08-05 15:28:09,696][15444] Updated weights for policy 0, policy_version 8411 (0.0029) [2024-08-05 15:28:13,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 68976640. Throughput: 0: 6077.3. Samples: 17245880. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:28:13,126][15372] Avg episode reward: [(0, '33.673')] [2024-08-05 15:28:13,172][15444] Updated weights for policy 0, policy_version 8421 (0.0026) [2024-08-05 15:28:16,507][15444] Updated weights for policy 0, policy_version 8431 (0.0011) [2024-08-05 15:28:18,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 69099520. Throughput: 0: 6078.0. Samples: 17282140. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:28:18,126][15372] Avg episode reward: [(0, '33.654')] [2024-08-05 15:28:18,138][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000008435_69099520.pth... [2024-08-05 15:28:18,250][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000007729_63315968.pth [2024-08-05 15:28:19,840][15444] Updated weights for policy 0, policy_version 8441 (0.0027) [2024-08-05 15:28:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 69222400. Throughput: 0: 6085.1. Samples: 17300440. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:28:23,126][15372] Avg episode reward: [(0, '32.779')] [2024-08-05 15:28:23,143][15444] Updated weights for policy 0, policy_version 8451 (0.0028) [2024-08-05 15:28:26,824][15444] Updated weights for policy 0, policy_version 8461 (0.0027) [2024-08-05 15:28:28,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 69345280. Throughput: 0: 6081.1. Samples: 17336390. Policy #0 lag: (min: 0.0, avg: 4.7, max: 10.0) [2024-08-05 15:28:28,119][15372] Avg episode reward: [(0, '33.318')] [2024-08-05 15:28:30,007][15444] Updated weights for policy 0, policy_version 8471 (0.0016) [2024-08-05 15:28:31,471][15417] Signal inference workers to stop experience collection... (2850 times) [2024-08-05 15:28:31,471][15417] Signal inference workers to resume experience collection... (2850 times) [2024-08-05 15:28:31,509][15444] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-08-05 15:28:31,510][15444] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-08-05 15:28:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.1, 300 sec: 24103.9). Total num frames: 69468160. Throughput: 0: 6095.4. Samples: 17373580. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:28:33,119][15372] Avg episode reward: [(0, '32.849')] [2024-08-05 15:28:33,272][15444] Updated weights for policy 0, policy_version 8481 (0.0012) [2024-08-05 15:28:36,755][15444] Updated weights for policy 0, policy_version 8491 (0.0024) [2024-08-05 15:28:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 69591040. Throughput: 0: 6093.3. Samples: 17391830. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:28:38,119][15372] Avg episode reward: [(0, '32.450')] [2024-08-05 15:28:39,875][15444] Updated weights for policy 0, policy_version 8501 (0.0013) [2024-08-05 15:28:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24131.7). Total num frames: 69713920. Throughput: 0: 6109.1. Samples: 17429220. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:28:43,126][15372] Avg episode reward: [(0, '33.117')] [2024-08-05 15:28:43,376][15444] Updated weights for policy 0, policy_version 8511 (0.0018) [2024-08-05 15:28:46,759][15444] Updated weights for policy 0, policy_version 8521 (0.0020) [2024-08-05 15:28:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.5, 300 sec: 24131.7). Total num frames: 69836800. Throughput: 0: 6093.4. Samples: 17464930. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:28:48,119][15372] Avg episode reward: [(0, '33.037')] [2024-08-05 15:28:49,977][15444] Updated weights for policy 0, policy_version 8531 (0.0013) [2024-08-05 15:28:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.5, 300 sec: 24131.7). Total num frames: 69959680. Throughput: 0: 6089.3. Samples: 17483400. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:28:53,119][15372] Avg episode reward: [(0, '33.104')] [2024-08-05 15:28:53,493][15444] Updated weights for policy 0, policy_version 8541 (0.0020) [2024-08-05 15:28:56,864][15444] Updated weights for policy 0, policy_version 8551 (0.0011) [2024-08-05 15:28:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 70074368. Throughput: 0: 6081.1. Samples: 17519530. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:28:58,119][15372] Avg episode reward: [(0, '33.469')] [2024-08-05 15:29:00,292][15444] Updated weights for policy 0, policy_version 8561 (0.0020) [2024-08-05 15:29:03,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 70197248. 
Throughput: 0: 6077.5. Samples: 17555630. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:29:03,127][15372] Avg episode reward: [(0, '33.426')] [2024-08-05 15:29:03,666][15444] Updated weights for policy 0, policy_version 8571 (0.0013) [2024-08-05 15:29:07,003][15444] Updated weights for policy 0, policy_version 8581 (0.0011) [2024-08-05 15:29:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24131.7). Total num frames: 70320128. Throughput: 0: 6078.7. Samples: 17573980. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:29:08,119][15372] Avg episode reward: [(0, '33.926')] [2024-08-05 15:29:08,180][15417] Saving new best policy, reward=33.926! [2024-08-05 15:29:10,266][15444] Updated weights for policy 0, policy_version 8591 (0.0027) [2024-08-05 15:29:13,119][15372] Fps is (10 sec: 23757.1, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 70434816. Throughput: 0: 6097.8. Samples: 17610790. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:29:13,126][15372] Avg episode reward: [(0, '34.182')] [2024-08-05 15:29:13,132][15417] Saving new best policy, reward=34.182! [2024-08-05 15:29:13,819][15444] Updated weights for policy 0, policy_version 8601 (0.0011) [2024-08-05 15:29:17,353][15444] Updated weights for policy 0, policy_version 8611 (0.0013) [2024-08-05 15:29:18,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 70557696. Throughput: 0: 6058.6. Samples: 17646220. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 15:29:18,119][15372] Avg episode reward: [(0, '34.002')] [2024-08-05 15:29:20,366][15444] Updated weights for policy 0, policy_version 8621 (0.0016) [2024-08-05 15:29:23,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24303.0, 300 sec: 24131.7). Total num frames: 70680576. Throughput: 0: 6063.8. Samples: 17664700. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 15:29:23,119][15372] Avg episode reward: [(0, '33.137')] [2024-08-05 15:29:24,209][15444] Updated weights for policy 0, policy_version 8631 (0.0012) [2024-08-05 15:29:26,821][15417] Signal inference workers to stop experience collection... (2900 times) [2024-08-05 15:29:26,825][15417] Signal inference workers to resume experience collection... (2900 times) [2024-08-05 15:29:26,895][15444] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-08-05 15:29:26,895][15444] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-08-05 15:29:27,245][15444] Updated weights for policy 0, policy_version 8641 (0.0014) [2024-08-05 15:29:28,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24166.1, 300 sec: 24131.7). Total num frames: 70795264. Throughput: 0: 6034.3. Samples: 17700770. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 15:29:28,127][15372] Avg episode reward: [(0, '32.919')] [2024-08-05 15:29:30,876][15444] Updated weights for policy 0, policy_version 8651 (0.0018) [2024-08-05 15:29:33,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 70918144. Throughput: 0: 6043.1. Samples: 17736870. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 15:29:33,126][15372] Avg episode reward: [(0, '33.160')] [2024-08-05 15:29:34,458][15444] Updated weights for policy 0, policy_version 8661 (0.0014) [2024-08-05 15:29:37,640][15444] Updated weights for policy 0, policy_version 8671 (0.0020) [2024-08-05 15:29:38,118][15372] Fps is (10 sec: 24577.9, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 71041024. Throughput: 0: 6006.7. Samples: 17753700. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:29:38,119][15372] Avg episode reward: [(0, '34.193')] [2024-08-05 15:29:38,122][15417] Saving new best policy, reward=34.193! 
[2024-08-05 15:29:41,263][15444] Updated weights for policy 0, policy_version 8681 (0.0014) [2024-08-05 15:29:43,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 71155712. Throughput: 0: 6004.2. Samples: 17789720. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:29:43,119][15372] Avg episode reward: [(0, '34.320')] [2024-08-05 15:29:43,119][15417] Saving new best policy, reward=34.320! [2024-08-05 15:29:44,378][15444] Updated weights for policy 0, policy_version 8691 (0.0026) [2024-08-05 15:29:48,120][15372] Fps is (10 sec: 22934.9, 60 sec: 23892.9, 300 sec: 24103.8). Total num frames: 71270400. Throughput: 0: 6010.6. Samples: 17826110. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 15:29:48,128][15372] Avg episode reward: [(0, '33.512')] [2024-08-05 15:29:48,146][15444] Updated weights for policy 0, policy_version 8701 (0.0014) [2024-08-05 15:29:51,455][15444] Updated weights for policy 0, policy_version 8711 (0.0020) [2024-08-05 15:29:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 71401472. Throughput: 0: 5997.1. Samples: 17843850. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:29:53,126][15372] Avg episode reward: [(0, '33.298')] [2024-08-05 15:29:54,706][15444] Updated weights for policy 0, policy_version 8721 (0.0014) [2024-08-05 15:29:58,118][15372] Fps is (10 sec: 24579.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 71516160. Throughput: 0: 5993.1. Samples: 17880480. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:29:58,126][15372] Avg episode reward: [(0, '33.784')] [2024-08-05 15:29:58,388][15444] Updated weights for policy 0, policy_version 8731 (0.0019) [2024-08-05 15:30:01,417][15444] Updated weights for policy 0, policy_version 8741 (0.0018) [2024-08-05 15:30:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.0, 300 sec: 24159.5). Total num frames: 71639040. Throughput: 0: 5995.4. Samples: 17916010. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:30:03,127][15372] Avg episode reward: [(0, '32.519')] [2024-08-05 15:30:04,929][15444] Updated weights for policy 0, policy_version 8751 (0.0033) [2024-08-05 15:30:05,611][15417] Signal inference workers to stop experience collection... (2950 times) [2024-08-05 15:30:05,612][15417] Signal inference workers to resume experience collection... (2950 times) [2024-08-05 15:30:05,658][15444] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-08-05 15:30:05,658][15444] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-08-05 15:30:08,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 71761920. Throughput: 0: 5991.8. Samples: 17934330. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:30:08,119][15372] Avg episode reward: [(0, '32.133')] [2024-08-05 15:30:08,386][15444] Updated weights for policy 0, policy_version 8761 (0.0012) [2024-08-05 15:30:11,677][15444] Updated weights for policy 0, policy_version 8771 (0.0010) [2024-08-05 15:30:13,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 71884800. Throughput: 0: 6001.4. Samples: 17970830. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:30:13,119][15372] Avg episode reward: [(0, '33.202')] [2024-08-05 15:30:15,096][15444] Updated weights for policy 0, policy_version 8781 (0.0011) [2024-08-05 15:30:18,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 72007680. Throughput: 0: 6024.0. Samples: 18007950. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 15:30:18,127][15372] Avg episode reward: [(0, '33.562')] [2024-08-05 15:30:18,175][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000008791_72015872.pth... 
[2024-08-05 15:30:18,189][15444] Updated weights for policy 0, policy_version 8791 (0.0011) [2024-08-05 15:30:18,297][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000008081_66199552.pth [2024-08-05 15:30:21,777][15444] Updated weights for policy 0, policy_version 8801 (0.0024) [2024-08-05 15:30:23,118][15372] Fps is (10 sec: 23757.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 72122368. Throughput: 0: 6046.4. Samples: 18025790. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 15:30:23,119][15372] Avg episode reward: [(0, '33.690')] [2024-08-05 15:30:25,358][15444] Updated weights for policy 0, policy_version 8811 (0.0018) [2024-08-05 15:30:28,118][15372] Fps is (10 sec: 23758.1, 60 sec: 24166.7, 300 sec: 24159.5). Total num frames: 72245248. Throughput: 0: 6028.0. Samples: 18060980. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 15:30:28,119][15372] Avg episode reward: [(0, '32.681')] [2024-08-05 15:30:28,818][15444] Updated weights for policy 0, policy_version 8821 (0.0013) [2024-08-05 15:30:32,138][15444] Updated weights for policy 0, policy_version 8831 (0.0019) [2024-08-05 15:30:33,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 72359936. Throughput: 0: 6018.6. Samples: 18096940. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:30:33,119][15372] Avg episode reward: [(0, '32.930')] [2024-08-05 15:30:35,382][15444] Updated weights for policy 0, policy_version 8841 (0.0011) [2024-08-05 15:30:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 72482816. Throughput: 0: 6036.0. Samples: 18115470. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:30:38,126][15372] Avg episode reward: [(0, '33.150')] [2024-08-05 15:30:39,170][15444] Updated weights for policy 0, policy_version 8851 (0.0039) [2024-08-05 15:30:42,410][15444] Updated weights for policy 0, policy_version 8861 (0.0043) [2024-08-05 15:30:43,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 72597504. Throughput: 0: 6023.8. Samples: 18151550. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:30:43,126][15372] Avg episode reward: [(0, '33.480')] [2024-08-05 15:30:43,617][15417] Signal inference workers to stop experience collection... (3000 times) [2024-08-05 15:30:43,618][15417] Signal inference workers to resume experience collection... (3000 times) [2024-08-05 15:30:43,681][15444] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-08-05 15:30:43,682][15444] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-08-05 15:30:45,674][15444] Updated weights for policy 0, policy_version 8871 (0.0012) [2024-08-05 15:30:48,118][15372] Fps is (10 sec: 25395.5, 60 sec: 24440.0, 300 sec: 24187.2). Total num frames: 72736768. Throughput: 0: 6055.3. Samples: 18188500. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 15:30:48,119][15372] Avg episode reward: [(0, '34.025')] [2024-08-05 15:30:49,198][15444] Updated weights for policy 0, policy_version 8881 (0.0012) [2024-08-05 15:30:52,291][15444] Updated weights for policy 0, policy_version 8891 (0.0028) [2024-08-05 15:30:53,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 72851456. Throughput: 0: 6042.3. Samples: 18206230. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:30:53,126][15372] Avg episode reward: [(0, '33.481')] [2024-08-05 15:30:55,873][15444] Updated weights for policy 0, policy_version 8901 (0.0013) [2024-08-05 15:30:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 72974336. Throughput: 0: 6047.8. Samples: 18242980. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:30:58,119][15372] Avg episode reward: [(0, '33.180')] [2024-08-05 15:30:58,959][15444] Updated weights for policy 0, policy_version 8911 (0.0012) [2024-08-05 15:31:02,411][15444] Updated weights for policy 0, policy_version 8921 (0.0021) [2024-08-05 15:31:03,120][15372] Fps is (10 sec: 23753.5, 60 sec: 24165.9, 300 sec: 24159.4). Total num frames: 73089024. Throughput: 0: 6024.3. Samples: 18279050. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 15:31:03,120][15372] Avg episode reward: [(0, '32.619')] [2024-08-05 15:31:05,912][15444] Updated weights for policy 0, policy_version 8931 (0.0014) [2024-08-05 15:31:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 73211904. Throughput: 0: 6034.2. Samples: 18297330. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:31:08,126][15372] Avg episode reward: [(0, '32.587')] [2024-08-05 15:31:09,294][15444] Updated weights for policy 0, policy_version 8941 (0.0023) [2024-08-05 15:31:12,907][15444] Updated weights for policy 0, policy_version 8951 (0.0022) [2024-08-05 15:31:13,119][15372] Fps is (10 sec: 24578.5, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 73334784. Throughput: 0: 6055.5. Samples: 18333480. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:31:13,119][15372] Avg episode reward: [(0, '33.027')] [2024-08-05 15:31:16,051][15444] Updated weights for policy 0, policy_version 8961 (0.0026) [2024-08-05 15:31:18,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24030.1, 300 sec: 24131.7). Total num frames: 73449472. 
Throughput: 0: 6058.7. Samples: 18369580. Policy #0 lag: (min: 0.0, avg: 4.1, max: 10.0) [2024-08-05 15:31:18,119][15372] Avg episode reward: [(0, '33.637')] [2024-08-05 15:31:19,556][15444] Updated weights for policy 0, policy_version 8971 (0.0021) [2024-08-05 15:31:22,452][15417] Signal inference workers to stop experience collection... (3050 times) [2024-08-05 15:31:22,460][15417] Signal inference workers to resume experience collection... (3050 times) [2024-08-05 15:31:22,517][15444] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-08-05 15:31:22,518][15444] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-08-05 15:31:22,773][15444] Updated weights for policy 0, policy_version 8981 (0.0018) [2024-08-05 15:31:23,119][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 73572352. Throughput: 0: 6041.8. Samples: 18387350. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:31:23,119][15372] Avg episode reward: [(0, '33.046')] [2024-08-05 15:31:26,290][15444] Updated weights for policy 0, policy_version 8991 (0.0023) [2024-08-05 15:31:28,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 73687040. Throughput: 0: 6042.9. Samples: 18423480. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:31:28,119][15372] Avg episode reward: [(0, '33.898')] [2024-08-05 15:31:29,892][15444] Updated weights for policy 0, policy_version 9001 (0.0012) [2024-08-05 15:31:32,991][15444] Updated weights for policy 0, policy_version 9011 (0.0015) [2024-08-05 15:31:33,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24303.0, 300 sec: 24187.5). Total num frames: 73818112. Throughput: 0: 6032.5. Samples: 18459960. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:31:33,119][15372] Avg episode reward: [(0, '34.854')] [2024-08-05 15:31:33,186][15417] Saving new best policy, reward=34.854! 
[2024-08-05 15:31:36,541][15444] Updated weights for policy 0, policy_version 9021 (0.0015) [2024-08-05 15:31:38,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 73932800. Throughput: 0: 6034.2. Samples: 18477770. Policy #0 lag: (min: 0.0, avg: 3.1, max: 9.0) [2024-08-05 15:31:38,126][15372] Avg episode reward: [(0, '33.197')] [2024-08-05 15:31:39,755][15444] Updated weights for policy 0, policy_version 9031 (0.0015) [2024-08-05 15:31:43,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 74055680. Throughput: 0: 6035.1. Samples: 18514560. Policy #0 lag: (min: 0.0, avg: 3.1, max: 9.0) [2024-08-05 15:31:43,126][15372] Avg episode reward: [(0, '33.052')] [2024-08-05 15:31:43,441][15444] Updated weights for policy 0, policy_version 9041 (0.0029) [2024-08-05 15:31:46,843][15444] Updated weights for policy 0, policy_version 9051 (0.0022) [2024-08-05 15:31:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 74178560. Throughput: 0: 6033.7. Samples: 18550560. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:31:48,119][15372] Avg episode reward: [(0, '34.076')] [2024-08-05 15:31:49,985][15444] Updated weights for policy 0, policy_version 9061 (0.0011) [2024-08-05 15:31:53,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 74293248. Throughput: 0: 6044.0. Samples: 18569310. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:31:53,126][15372] Avg episode reward: [(0, '33.733')] [2024-08-05 15:31:53,664][15444] Updated weights for policy 0, policy_version 9071 (0.0026) [2024-08-05 15:31:54,018][15417] Signal inference workers to stop experience collection... (3100 times) [2024-08-05 15:31:54,019][15417] Signal inference workers to resume experience collection... 
(3100 times) [2024-08-05 15:31:54,062][15444] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-08-05 15:31:54,062][15444] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-08-05 15:31:56,682][15444] Updated weights for policy 0, policy_version 9081 (0.0027) [2024-08-05 15:31:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 74416128. Throughput: 0: 6034.0. Samples: 18605010. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:31:58,119][15372] Avg episode reward: [(0, '34.288')] [2024-08-05 15:32:00,331][15444] Updated weights for policy 0, policy_version 9091 (0.0019) [2024-08-05 15:32:03,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24303.5, 300 sec: 24215.0). Total num frames: 74547200. Throughput: 0: 6028.7. Samples: 18640870. Policy #0 lag: (min: 2.0, avg: 5.5, max: 10.0) [2024-08-05 15:32:03,119][15372] Avg episode reward: [(0, '34.133')] [2024-08-05 15:32:04,052][15444] Updated weights for policy 0, policy_version 9101 (0.0029) [2024-08-05 15:32:06,903][15444] Updated weights for policy 0, policy_version 9111 (0.0029) [2024-08-05 15:32:08,119][15372] Fps is (10 sec: 23754.6, 60 sec: 24029.5, 300 sec: 24159.4). Total num frames: 74653696. Throughput: 0: 6043.9. Samples: 18659330. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:32:08,127][15372] Avg episode reward: [(0, '33.498')] [2024-08-05 15:32:10,684][15444] Updated weights for policy 0, policy_version 9121 (0.0012) [2024-08-05 15:32:13,118][15372] Fps is (10 sec: 22937.7, 60 sec: 24030.0, 300 sec: 24159.5). Total num frames: 74776576. Throughput: 0: 6037.4. Samples: 18695160. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:32:13,119][15372] Avg episode reward: [(0, '33.929')] [2024-08-05 15:32:13,800][15444] Updated weights for policy 0, policy_version 9131 (0.0016) [2024-08-05 15:32:17,362][15444] Updated weights for policy 0, policy_version 9141 (0.0012) [2024-08-05 15:32:18,119][15372] Fps is (10 sec: 25396.9, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 74907648. Throughput: 0: 6022.4. Samples: 18730970. Policy #0 lag: (min: 0.0, avg: 5.0, max: 9.0) [2024-08-05 15:32:18,119][15372] Avg episode reward: [(0, '34.013')] [2024-08-05 15:32:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000009144_74907648.pth... [2024-08-05 15:32:18,223][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000008435_69099520.pth [2024-08-05 15:32:19,129][15417] Signal inference workers to stop experience collection... (3150 times) [2024-08-05 15:32:19,134][15417] Signal inference workers to resume experience collection... (3150 times) [2024-08-05 15:32:19,185][15444] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-08-05 15:32:19,189][15444] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-08-05 15:32:20,971][15444] Updated weights for policy 0, policy_version 9151 (0.0010) [2024-08-05 15:32:23,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 75014144. Throughput: 0: 6044.8. Samples: 18749790. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 15:32:23,119][15372] Avg episode reward: [(0, '33.492')] [2024-08-05 15:32:23,923][15444] Updated weights for policy 0, policy_version 9161 (0.0029) [2024-08-05 15:32:27,625][15444] Updated weights for policy 0, policy_version 9171 (0.0020) [2024-08-05 15:32:28,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24303.0, 300 sec: 24187.3). 
Total num frames: 75145216. Throughput: 0: 6024.0. Samples: 18785640. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 15:32:28,119][15372] Avg episode reward: [(0, '34.294')] [2024-08-05 15:32:30,667][15444] Updated weights for policy 0, policy_version 9181 (0.0010) [2024-08-05 15:32:33,119][15372] Fps is (10 sec: 25395.6, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 75268096. Throughput: 0: 6024.0. Samples: 18821640. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 15:32:33,126][15372] Avg episode reward: [(0, '34.473')] [2024-08-05 15:32:34,377][15444] Updated weights for policy 0, policy_version 9191 (0.0014) [2024-08-05 15:32:37,700][15444] Updated weights for policy 0, policy_version 9201 (0.0022) [2024-08-05 15:32:38,118][15372] Fps is (10 sec: 22937.4, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 75374592. Throughput: 0: 6007.1. Samples: 18839630. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 15:32:38,119][15372] Avg episode reward: [(0, '33.599')] [2024-08-05 15:32:40,982][15444] Updated weights for policy 0, policy_version 9211 (0.0013) [2024-08-05 15:32:43,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 75505664. Throughput: 0: 6008.0. Samples: 18875370. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 15:32:43,119][15372] Avg episode reward: [(0, '33.429')] [2024-08-05 15:32:44,700][15444] Updated weights for policy 0, policy_version 9221 (0.0014) [2024-08-05 15:32:47,864][15444] Updated weights for policy 0, policy_version 9231 (0.0024) [2024-08-05 15:32:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 75620352. Throughput: 0: 6012.0. Samples: 18911410. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 15:32:48,119][15372] Avg episode reward: [(0, '33.863')] [2024-08-05 15:32:51,299][15444] Updated weights for policy 0, policy_version 9241 (0.0015) [2024-08-05 15:32:53,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 75743232. Throughput: 0: 6021.6. Samples: 18930300. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:32:53,126][15372] Avg episode reward: [(0, '33.221')] [2024-08-05 15:32:54,703][15444] Updated weights for policy 0, policy_version 9251 (0.0020) [2024-08-05 15:32:58,071][15444] Updated weights for policy 0, policy_version 9261 (0.0022) [2024-08-05 15:32:58,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 75866112. Throughput: 0: 6033.7. Samples: 18966680. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:32:58,119][15372] Avg episode reward: [(0, '33.038')] [2024-08-05 15:33:00,593][15417] Signal inference workers to stop experience collection... (3200 times) [2024-08-05 15:33:00,598][15417] Signal inference workers to resume experience collection... (3200 times) [2024-08-05 15:33:00,665][15444] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-08-05 15:33:00,666][15444] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-08-05 15:33:01,325][15444] Updated weights for policy 0, policy_version 9271 (0.0011) [2024-08-05 15:33:03,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 75988992. Throughput: 0: 6042.5. Samples: 19002880. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:33:03,119][15372] Avg episode reward: [(0, '33.874')] [2024-08-05 15:33:04,653][15444] Updated weights for policy 0, policy_version 9281 (0.0015) [2024-08-05 15:33:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.7, 300 sec: 24159.5). Total num frames: 76103680. Throughput: 0: 6045.4. Samples: 19021830. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:33:08,123][15444] Updated weights for policy 0, policy_version 9291 (0.0017) [2024-08-05 15:33:08,126][15372] Avg episode reward: [(0, '33.494')] [2024-08-05 15:33:11,240][15444] Updated weights for policy 0, policy_version 9301 (0.0023) [2024-08-05 15:33:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 76234752. Throughput: 0: 6059.8. Samples: 19058330. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:33:13,126][15372] Avg episode reward: [(0, '34.216')] [2024-08-05 15:33:14,681][15444] Updated weights for policy 0, policy_version 9311 (0.0023) [2024-08-05 15:33:18,089][15444] Updated weights for policy 0, policy_version 9321 (0.0017) [2024-08-05 15:33:18,118][15372] Fps is (10 sec: 25395.4, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 76357632. Throughput: 0: 6077.6. Samples: 19095130. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 15:33:18,119][15372] Avg episode reward: [(0, '34.852')] [2024-08-05 15:33:21,345][15444] Updated weights for policy 0, policy_version 9331 (0.0013) [2024-08-05 15:33:23,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 76480512. Throughput: 0: 6089.8. Samples: 19113670. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:33:23,127][15372] Avg episode reward: [(0, '33.172')] [2024-08-05 15:33:24,857][15444] Updated weights for policy 0, policy_version 9341 (0.0013) [2024-08-05 15:33:28,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 76595200. Throughput: 0: 6105.1. Samples: 19150100. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:33:28,126][15372] Avg episode reward: [(0, '33.661')] [2024-08-05 15:33:28,237][15444] Updated weights for policy 0, policy_version 9351 (0.0012) [2024-08-05 15:33:31,629][15444] Updated weights for policy 0, policy_version 9361 (0.0013) [2024-08-05 15:33:33,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 76718080. Throughput: 0: 6098.0. Samples: 19185820. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 15:33:33,126][15372] Avg episode reward: [(0, '34.427')] [2024-08-05 15:33:34,978][15444] Updated weights for policy 0, policy_version 9371 (0.0026) [2024-08-05 15:33:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24159.5). Total num frames: 76840960. Throughput: 0: 6094.9. Samples: 19204570. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 15:33:38,126][15372] Avg episode reward: [(0, '33.710')] [2024-08-05 15:33:38,251][15444] Updated weights for policy 0, policy_version 9381 (0.0019) [2024-08-05 15:33:41,984][15444] Updated weights for policy 0, policy_version 9391 (0.0020) [2024-08-05 15:33:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 76963840. Throughput: 0: 6097.8. Samples: 19241080. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 15:33:43,119][15372] Avg episode reward: [(0, '33.908')] [2024-08-05 15:33:44,906][15444] Updated weights for policy 0, policy_version 9401 (0.0013) [2024-08-05 15:33:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 77078528. Throughput: 0: 6099.6. Samples: 19277360. 
Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:33:48,126][15372] Avg episode reward: [(0, '34.274')] [2024-08-05 15:33:48,427][15444] Updated weights for policy 0, policy_version 9411 (0.0019) [2024-08-05 15:33:52,065][15444] Updated weights for policy 0, policy_version 9421 (0.0012) [2024-08-05 15:33:53,119][15372] Fps is (10 sec: 24573.7, 60 sec: 24439.2, 300 sec: 24187.2). Total num frames: 77209600. Throughput: 0: 6082.1. Samples: 19295530. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:33:53,120][15372] Avg episode reward: [(0, '34.321')] [2024-08-05 15:33:55,095][15444] Updated weights for policy 0, policy_version 9431 (0.0035) [2024-08-05 15:33:58,128][15372] Fps is (10 sec: 24552.6, 60 sec: 24299.1, 300 sec: 24158.7). Total num frames: 77324288. Throughput: 0: 6080.0. Samples: 19331990. Policy #0 lag: (min: 0.0, avg: 5.0, max: 10.0) [2024-08-05 15:33:58,136][15372] Avg episode reward: [(0, '33.150')] [2024-08-05 15:33:58,797][15444] Updated weights for policy 0, policy_version 9441 (0.0023) [2024-08-05 15:33:59,149][15417] Signal inference workers to stop experience collection... (3250 times) [2024-08-05 15:33:59,150][15417] Signal inference workers to resume experience collection... (3250 times) [2024-08-05 15:33:59,193][15444] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-08-05 15:33:59,193][15444] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-08-05 15:34:02,012][15444] Updated weights for policy 0, policy_version 9451 (0.0022) [2024-08-05 15:34:03,118][15372] Fps is (10 sec: 22939.6, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 77438976. Throughput: 0: 6072.9. Samples: 19368410. 
Policy #0 lag: (min: 2.0, avg: 3.8, max: 8.0) [2024-08-05 15:34:03,126][15372] Avg episode reward: [(0, '32.984')] [2024-08-05 15:34:05,378][15444] Updated weights for policy 0, policy_version 9461 (0.0013) [2024-08-05 15:34:08,119][15372] Fps is (10 sec: 24599.1, 60 sec: 24439.4, 300 sec: 24187.2). Total num frames: 77570048. Throughput: 0: 6043.6. Samples: 19385630. Policy #0 lag: (min: 2.0, avg: 3.8, max: 8.0) [2024-08-05 15:34:08,126][15372] Avg episode reward: [(0, '33.388')] [2024-08-05 15:34:09,078][15444] Updated weights for policy 0, policy_version 9471 (0.0013) [2024-08-05 15:34:12,070][15444] Updated weights for policy 0, policy_version 9481 (0.0027) [2024-08-05 15:34:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 77684736. Throughput: 0: 6042.0. Samples: 19421990. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:34:13,126][15372] Avg episode reward: [(0, '33.666')] [2024-08-05 15:34:15,771][15444] Updated weights for policy 0, policy_version 9491 (0.0020) [2024-08-05 15:34:18,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 77807616. Throughput: 0: 6053.7. Samples: 19458240. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 15:34:18,126][15372] Avg episode reward: [(0, '32.774')] [2024-08-05 15:34:18,133][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000009498_77807616.pth... [2024-08-05 15:34:18,258][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000008791_72015872.pth [2024-08-05 15:34:19,097][15444] Updated weights for policy 0, policy_version 9501 (0.0013) [2024-08-05 15:34:22,476][15444] Updated weights for policy 0, policy_version 9511 (0.0013) [2024-08-05 15:34:23,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 77930496. 
Throughput: 0: 6019.3. Samples: 19475440. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 15:34:23,119][15372] Avg episode reward: [(0, '34.154')] [2024-08-05 15:34:25,982][15444] Updated weights for policy 0, policy_version 9521 (0.0019) [2024-08-05 15:34:28,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 78045184. Throughput: 0: 6008.9. Samples: 19511480. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:34:28,119][15372] Avg episode reward: [(0, '34.520')] [2024-08-05 15:34:29,356][15444] Updated weights for policy 0, policy_version 9531 (0.0022) [2024-08-05 15:34:32,831][15444] Updated weights for policy 0, policy_version 9541 (0.0014) [2024-08-05 15:34:33,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 78168064. Throughput: 0: 6002.0. Samples: 19547450. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:34:33,119][15372] Avg episode reward: [(0, '33.889')] [2024-08-05 15:34:35,546][15417] Signal inference workers to stop experience collection... (3300 times) [2024-08-05 15:34:35,547][15417] Signal inference workers to resume experience collection... (3300 times) [2024-08-05 15:34:35,585][15444] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-08-05 15:34:35,586][15444] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-08-05 15:34:35,931][15444] Updated weights for policy 0, policy_version 9551 (0.0017) [2024-08-05 15:34:38,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 78290944. Throughput: 0: 6011.5. Samples: 19566040. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:34:38,119][15372] Avg episode reward: [(0, '34.493')] [2024-08-05 15:34:39,372][15444] Updated weights for policy 0, policy_version 9561 (0.0020) [2024-08-05 15:34:42,921][15444] Updated weights for policy 0, policy_version 9571 (0.0015) [2024-08-05 15:34:43,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.8, 300 sec: 24187.3). Total num frames: 78405632. Throughput: 0: 6028.2. Samples: 19603200. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:34:43,119][15372] Avg episode reward: [(0, '33.801')] [2024-08-05 15:34:46,001][15444] Updated weights for policy 0, policy_version 9581 (0.0011) [2024-08-05 15:34:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 78536704. Throughput: 0: 6014.0. Samples: 19639040. Policy #0 lag: (min: 1.0, avg: 3.2, max: 8.0) [2024-08-05 15:34:48,119][15372] Avg episode reward: [(0, '34.217')] [2024-08-05 15:34:49,781][15444] Updated weights for policy 0, policy_version 9591 (0.0017) [2024-08-05 15:34:53,064][15444] Updated weights for policy 0, policy_version 9601 (0.0026) [2024-08-05 15:34:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24030.2, 300 sec: 24187.2). Total num frames: 78651392. Throughput: 0: 6025.8. Samples: 19656790. Policy #0 lag: (min: 1.0, avg: 3.2, max: 8.0) [2024-08-05 15:34:53,119][15372] Avg episode reward: [(0, '34.284')] [2024-08-05 15:34:56,324][15444] Updated weights for policy 0, policy_version 9611 (0.0021) [2024-08-05 15:34:58,118][15372] Fps is (10 sec: 22937.4, 60 sec: 24033.7, 300 sec: 24159.5). Total num frames: 78766080. Throughput: 0: 6014.7. Samples: 19692650. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:34:58,126][15372] Avg episode reward: [(0, '33.627')] [2024-08-05 15:35:00,040][15444] Updated weights for policy 0, policy_version 9621 (0.0014) [2024-08-05 15:35:03,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 78888960. 
Throughput: 0: 6010.2. Samples: 19728700. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:35:03,126][15372] Avg episode reward: [(0, '33.763')] [2024-08-05 15:35:03,232][15444] Updated weights for policy 0, policy_version 9631 (0.0019) [2024-08-05 15:35:06,670][15444] Updated weights for policy 0, policy_version 9641 (0.0025) [2024-08-05 15:35:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 79011840. Throughput: 0: 6029.6. Samples: 19746770. Policy #0 lag: (min: 0.0, avg: 4.9, max: 9.0) [2024-08-05 15:35:08,126][15372] Avg episode reward: [(0, '34.038')] [2024-08-05 15:35:10,599][15444] Updated weights for policy 0, policy_version 9651 (0.0014) [2024-08-05 15:35:13,119][15372] Fps is (10 sec: 22118.2, 60 sec: 23756.7, 300 sec: 24076.2). Total num frames: 79110144. Throughput: 0: 5949.1. Samples: 19779190. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 15:35:13,127][15372] Avg episode reward: [(0, '33.756')] [2024-08-05 15:35:14,228][15444] Updated weights for policy 0, policy_version 9661 (0.0020) [2024-08-05 15:35:17,855][15444] Updated weights for policy 0, policy_version 9671 (0.0016) [2024-08-05 15:35:18,118][15372] Fps is (10 sec: 21299.3, 60 sec: 23620.4, 300 sec: 24076.1). Total num frames: 79224832. Throughput: 0: 5913.3. Samples: 19813550. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 15:35:18,119][15372] Avg episode reward: [(0, '33.475')] [2024-08-05 15:35:21,037][15444] Updated weights for policy 0, policy_version 9681 (0.0020) [2024-08-05 15:35:23,118][15372] Fps is (10 sec: 23757.3, 60 sec: 23620.3, 300 sec: 24076.1). Total num frames: 79347712. Throughput: 0: 5904.9. Samples: 19831760. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:35:23,119][15372] Avg episode reward: [(0, '32.599')] [2024-08-05 15:35:23,157][15417] Signal inference workers to stop experience collection... 
(3350 times) [2024-08-05 15:35:23,158][15417] Signal inference workers to resume experience collection... (3350 times) [2024-08-05 15:35:23,200][15444] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-08-05 15:35:23,200][15444] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-08-05 15:35:24,931][15444] Updated weights for policy 0, policy_version 9691 (0.0019) [2024-08-05 15:35:28,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23620.2, 300 sec: 24076.2). Total num frames: 79462400. Throughput: 0: 5856.9. Samples: 19866760. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 15:35:28,127][15372] Avg episode reward: [(0, '33.362')] [2024-08-05 15:35:28,400][15444] Updated weights for policy 0, policy_version 9701 (0.0015) [2024-08-05 15:35:31,779][15444] Updated weights for policy 0, policy_version 9711 (0.0030) [2024-08-05 15:35:33,119][15372] Fps is (10 sec: 22936.5, 60 sec: 23483.6, 300 sec: 24048.3). Total num frames: 79577088. Throughput: 0: 5811.3. Samples: 19900550. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 15:35:33,119][15372] Avg episode reward: [(0, '34.890')] [2024-08-05 15:35:33,120][15417] Saving new best policy, reward=34.890! [2024-08-05 15:35:35,403][15444] Updated weights for policy 0, policy_version 9721 (0.0010) [2024-08-05 15:35:38,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23483.7, 300 sec: 24076.1). Total num frames: 79699968. Throughput: 0: 5828.2. Samples: 19919060. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:35:38,119][15372] Avg episode reward: [(0, '34.491')] [2024-08-05 15:35:38,991][15444] Updated weights for policy 0, policy_version 9731 (0.0014) [2024-08-05 15:35:41,979][15444] Updated weights for policy 0, policy_version 9741 (0.0012) [2024-08-05 15:35:43,118][15372] Fps is (10 sec: 23758.1, 60 sec: 23483.8, 300 sec: 23992.8). Total num frames: 79814656. Throughput: 0: 5834.0. Samples: 19955180. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:35:43,119][15372] Avg episode reward: [(0, '33.944')] [2024-08-05 15:35:45,539][15444] Updated weights for policy 0, policy_version 9751 (0.0011) [2024-08-05 15:35:48,118][15372] Fps is (10 sec: 23757.2, 60 sec: 23347.2, 300 sec: 24020.6). Total num frames: 79937536. Throughput: 0: 5828.5. Samples: 19990980. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:35:48,127][15372] Avg episode reward: [(0, '34.427')] [2024-08-05 15:35:48,817][15444] Updated weights for policy 0, policy_version 9761 (0.0017) [2024-08-05 15:35:52,395][15444] Updated weights for policy 0, policy_version 9771 (0.0013) [2024-08-05 15:35:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 23483.7, 300 sec: 24020.6). Total num frames: 80060416. Throughput: 0: 5827.6. Samples: 20009010. Policy #0 lag: (min: 1.0, avg: 3.2, max: 8.0) [2024-08-05 15:35:53,119][15372] Avg episode reward: [(0, '33.710')] [2024-08-05 15:35:55,834][15444] Updated weights for policy 0, policy_version 9781 (0.0011) [2024-08-05 15:35:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 23620.3, 300 sec: 24048.5). Total num frames: 80183296. Throughput: 0: 5903.6. Samples: 20044850. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 15:35:58,126][15372] Avg episode reward: [(0, '34.268')] [2024-08-05 15:35:59,139][15444] Updated weights for policy 0, policy_version 9791 (0.0011) [2024-08-05 15:36:02,641][15444] Updated weights for policy 0, policy_version 9801 (0.0030) [2024-08-05 15:36:03,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23483.8, 300 sec: 24020.6). Total num frames: 80297984. Throughput: 0: 5933.8. Samples: 20080570. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 15:36:03,119][15372] Avg episode reward: [(0, '33.752')] [2024-08-05 15:36:06,010][15444] Updated weights for policy 0, policy_version 9811 (0.0020) [2024-08-05 15:36:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23483.7, 300 sec: 24020.6). Total num frames: 80420864. 
Throughput: 0: 5934.9. Samples: 20098830. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 15:36:08,119][15372] Avg episode reward: [(0, '33.814')] [2024-08-05 15:36:09,594][15444] Updated weights for policy 0, policy_version 9821 (0.0011) [2024-08-05 15:36:09,912][15417] Signal inference workers to stop experience collection... (3400 times) [2024-08-05 15:36:09,912][15417] Signal inference workers to resume experience collection... (3400 times) [2024-08-05 15:36:09,957][15444] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-08-05 15:36:09,958][15444] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-08-05 15:36:13,019][15444] Updated weights for policy 0, policy_version 9831 (0.0019) [2024-08-05 15:36:13,124][15372] Fps is (10 sec: 23743.8, 60 sec: 23754.7, 300 sec: 24020.2). Total num frames: 80535552. Throughput: 0: 5962.2. Samples: 20135090. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 15:36:13,124][15372] Avg episode reward: [(0, '34.402')] [2024-08-05 15:36:16,162][15444] Updated weights for policy 0, policy_version 9841 (0.0011) [2024-08-05 15:36:18,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23756.8, 300 sec: 23992.9). Total num frames: 80650240. Throughput: 0: 6014.1. Samples: 20171180. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:36:18,126][15372] Avg episode reward: [(0, '33.307')] [2024-08-05 15:36:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000009846_80658432.pth... 
[2024-08-05 15:36:18,279][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000009144_74907648.pth [2024-08-05 15:36:19,598][15444] Updated weights for policy 0, policy_version 9851 (0.0019) [2024-08-05 15:36:22,933][15444] Updated weights for policy 0, policy_version 9861 (0.0023) [2024-08-05 15:36:23,119][15372] Fps is (10 sec: 24589.2, 60 sec: 23893.3, 300 sec: 24048.4). Total num frames: 80781312. Throughput: 0: 6000.4. Samples: 20189080. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:36:23,119][15372] Avg episode reward: [(0, '33.610')] [2024-08-05 15:36:26,582][15444] Updated weights for policy 0, policy_version 9871 (0.0026) [2024-08-05 15:36:28,120][15372] Fps is (10 sec: 24573.0, 60 sec: 23892.9, 300 sec: 23992.7). Total num frames: 80896000. Throughput: 0: 5997.8. Samples: 20225090. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:36:28,120][15372] Avg episode reward: [(0, '34.488')] [2024-08-05 15:36:29,983][15444] Updated weights for policy 0, policy_version 9881 (0.0020) [2024-08-05 15:36:33,034][15444] Updated weights for policy 0, policy_version 9891 (0.0011) [2024-08-05 15:36:33,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.6, 300 sec: 24048.4). Total num frames: 81027072. Throughput: 0: 6020.7. Samples: 20261910. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:36:33,119][15372] Avg episode reward: [(0, '34.838')] [2024-08-05 15:36:36,712][15444] Updated weights for policy 0, policy_version 9901 (0.0020) [2024-08-05 15:36:38,119][15372] Fps is (10 sec: 24578.6, 60 sec: 24029.9, 300 sec: 24020.6). Total num frames: 81141760. Throughput: 0: 6019.8. Samples: 20279900. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 15:36:38,119][15372] Avg episode reward: [(0, '34.903')] [2024-08-05 15:36:38,198][15417] Saving new best policy, reward=34.903! 
[2024-08-05 15:36:39,874][15444] Updated weights for policy 0, policy_version 9911 (0.0014) [2024-08-05 15:36:43,119][15372] Fps is (10 sec: 23755.3, 60 sec: 24166.1, 300 sec: 24020.6). Total num frames: 81264640. Throughput: 0: 6046.1. Samples: 20316930. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 15:36:43,127][15372] Avg episode reward: [(0, '33.574')] [2024-08-05 15:36:43,259][15444] Updated weights for policy 0, policy_version 9921 (0.0012) [2024-08-05 15:36:46,801][15444] Updated weights for policy 0, policy_version 9931 (0.0030) [2024-08-05 15:36:48,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24048.4). Total num frames: 81387520. Throughput: 0: 6053.1. Samples: 20352960. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:36:48,119][15372] Avg episode reward: [(0, '34.183')] [2024-08-05 15:36:49,924][15444] Updated weights for policy 0, policy_version 9941 (0.0018) [2024-08-05 15:36:51,845][15417] Signal inference workers to stop experience collection... (3450 times) [2024-08-05 15:36:51,847][15417] Signal inference workers to resume experience collection... (3450 times) [2024-08-05 15:36:51,923][15444] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-08-05 15:36:51,923][15444] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-08-05 15:36:53,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.2, 300 sec: 24048.3). Total num frames: 81510400. Throughput: 0: 6051.0. Samples: 20371130. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:36:53,119][15372] Avg episode reward: [(0, '34.963')] [2024-08-05 15:36:53,120][15417] Saving new best policy, reward=34.963! [2024-08-05 15:36:53,473][15444] Updated weights for policy 0, policy_version 9951 (0.0016) [2024-08-05 15:36:56,541][15444] Updated weights for policy 0, policy_version 9961 (0.0027) [2024-08-05 15:36:58,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24020.6). Total num frames: 81633280. 
Throughput: 0: 6065.0. Samples: 20407980. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:36:58,127][15372] Avg episode reward: [(0, '33.541')] [2024-08-05 15:37:00,042][15444] Updated weights for policy 0, policy_version 9971 (0.0011) [2024-08-05 15:37:03,118][15372] Fps is (10 sec: 24577.3, 60 sec: 24302.9, 300 sec: 24076.2). Total num frames: 81756160. Throughput: 0: 6082.7. Samples: 20444900. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:37:03,126][15372] Avg episode reward: [(0, '34.194')] [2024-08-05 15:37:03,461][15444] Updated weights for policy 0, policy_version 9981 (0.0037) [2024-08-05 15:37:06,663][15444] Updated weights for policy 0, policy_version 9991 (0.0025) [2024-08-05 15:37:08,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24048.4). Total num frames: 81870848. Throughput: 0: 6100.9. Samples: 20463620. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:37:08,126][15372] Avg episode reward: [(0, '35.047')] [2024-08-05 15:37:08,152][15417] Saving new best policy, reward=35.047! [2024-08-05 15:37:10,165][15444] Updated weights for policy 0, policy_version 10001 (0.0014) [2024-08-05 15:37:13,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24441.6, 300 sec: 24048.4). Total num frames: 82001920. Throughput: 0: 6114.4. Samples: 20500230. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:37:13,120][15372] Avg episode reward: [(0, '34.700')] [2024-08-05 15:37:13,278][15444] Updated weights for policy 0, policy_version 10011 (0.0018) [2024-08-05 15:37:16,840][15444] Updated weights for policy 0, policy_version 10021 (0.0012) [2024-08-05 15:37:18,118][15372] Fps is (10 sec: 25395.5, 60 sec: 24576.0, 300 sec: 24103.9). Total num frames: 82124800. Throughput: 0: 6096.2. Samples: 20536240. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 15:37:18,119][15372] Avg episode reward: [(0, '34.440')] [2024-08-05 15:37:20,246][15444] Updated weights for policy 0, policy_version 10031 (0.0019) [2024-08-05 15:37:23,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24302.8, 300 sec: 24048.3). Total num frames: 82239488. Throughput: 0: 6111.7. Samples: 20554930. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 15:37:23,119][15372] Avg episode reward: [(0, '34.534')] [2024-08-05 15:37:23,645][15444] Updated weights for policy 0, policy_version 10041 (0.0017) [2024-08-05 15:37:26,880][15444] Updated weights for policy 0, policy_version 10051 (0.0027) [2024-08-05 15:37:28,119][15372] Fps is (10 sec: 22936.8, 60 sec: 24303.3, 300 sec: 24020.6). Total num frames: 82354176. Throughput: 0: 6079.8. Samples: 20590520. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:37:28,119][15372] Avg episode reward: [(0, '34.588')] [2024-08-05 15:37:30,310][15444] Updated weights for policy 0, policy_version 10061 (0.0019) [2024-08-05 15:37:33,118][15372] Fps is (10 sec: 25396.2, 60 sec: 24439.5, 300 sec: 24131.7). Total num frames: 82493440. Throughput: 0: 6095.6. Samples: 20627260. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:37:33,119][15372] Avg episode reward: [(0, '33.793')] [2024-08-05 15:37:33,746][15444] Updated weights for policy 0, policy_version 10071 (0.0024) [2024-08-05 15:37:36,970][15444] Updated weights for policy 0, policy_version 10081 (0.0022) [2024-08-05 15:37:38,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24303.0, 300 sec: 24048.4). Total num frames: 82599936. Throughput: 0: 6104.5. Samples: 20645830. Policy #0 lag: (min: 0.0, avg: 3.6, max: 9.0) [2024-08-05 15:37:38,119][15372] Avg episode reward: [(0, '34.655')] [2024-08-05 15:37:40,598][15444] Updated weights for policy 0, policy_version 10091 (0.0021) [2024-08-05 15:37:42,549][15417] Signal inference workers to stop experience collection... 
(3500 times) [2024-08-05 15:37:42,550][15417] Signal inference workers to resume experience collection... (3500 times) [2024-08-05 15:37:42,604][15444] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-08-05 15:37:42,605][15444] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-08-05 15:37:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24439.8, 300 sec: 24103.9). Total num frames: 82731008. Throughput: 0: 6088.7. Samples: 20681970. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 15:37:43,119][15372] Avg episode reward: [(0, '34.607')] [2024-08-05 15:37:43,524][15444] Updated weights for policy 0, policy_version 10101 (0.0016) [2024-08-05 15:37:47,172][15444] Updated weights for policy 0, policy_version 10111 (0.0034) [2024-08-05 15:37:48,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24439.5, 300 sec: 24103.9). Total num frames: 82853888. Throughput: 0: 6081.3. Samples: 20718560. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 15:37:48,119][15372] Avg episode reward: [(0, '34.193')] [2024-08-05 15:37:50,699][15444] Updated weights for policy 0, policy_version 10121 (0.0012) [2024-08-05 15:37:53,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24303.1, 300 sec: 24076.1). Total num frames: 82968576. Throughput: 0: 6092.0. Samples: 20737760. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 15:37:53,127][15372] Avg episode reward: [(0, '33.844')] [2024-08-05 15:37:53,715][15444] Updated weights for policy 0, policy_version 10131 (0.0012) [2024-08-05 15:37:57,600][15444] Updated weights for policy 0, policy_version 10141 (0.0018) [2024-08-05 15:37:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24076.1). Total num frames: 83091456. Throughput: 0: 6059.1. Samples: 20772890. 
Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 15:37:58,119][15372] Avg episode reward: [(0, '33.440')] [2024-08-05 15:38:00,660][15444] Updated weights for policy 0, policy_version 10151 (0.0024) [2024-08-05 15:38:03,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24076.2). Total num frames: 83206144. Throughput: 0: 6050.7. Samples: 20808520. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:38:03,126][15372] Avg episode reward: [(0, '34.497')] [2024-08-05 15:38:04,263][15444] Updated weights for policy 0, policy_version 10161 (0.0011) [2024-08-05 15:38:07,975][15444] Updated weights for policy 0, policy_version 10171 (0.0012) [2024-08-05 15:38:08,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24166.4, 300 sec: 24020.6). Total num frames: 83320832. Throughput: 0: 6046.7. Samples: 20827030. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 15:38:08,119][15372] Avg episode reward: [(0, '35.718')] [2024-08-05 15:38:08,191][15417] Saving new best policy, reward=35.718! [2024-08-05 15:38:10,534][15417] Signal inference workers to stop experience collection... (3550 times) [2024-08-05 15:38:10,535][15417] Signal inference workers to resume experience collection... (3550 times) [2024-08-05 15:38:10,609][15444] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-08-05 15:38:10,609][15444] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-08-05 15:38:10,878][15444] Updated weights for policy 0, policy_version 10181 (0.0048) [2024-08-05 15:38:13,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24303.0, 300 sec: 24076.1). Total num frames: 83460096. Throughput: 0: 6055.8. Samples: 20863030. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 15:38:13,119][15372] Avg episode reward: [(0, '35.971')] [2024-08-05 15:38:13,119][15417] Saving new best policy, reward=35.971! 
[2024-08-05 15:38:14,546][15444] Updated weights for policy 0, policy_version 10191 (0.0011) [2024-08-05 15:38:17,589][15444] Updated weights for policy 0, policy_version 10201 (0.0015) [2024-08-05 15:38:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.8, 300 sec: 24020.6). Total num frames: 83566592. Throughput: 0: 6040.9. Samples: 20899100. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 15:38:18,120][15372] Avg episode reward: [(0, '35.001')] [2024-08-05 15:38:18,138][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000010201_83566592.pth... [2024-08-05 15:38:18,279][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000009498_77807616.pth [2024-08-05 15:38:21,211][15444] Updated weights for policy 0, policy_version 10211 (0.0016) [2024-08-05 15:38:23,118][15372] Fps is (10 sec: 22937.7, 60 sec: 24166.5, 300 sec: 24048.4). Total num frames: 83689472. Throughput: 0: 6035.8. Samples: 20917440. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 15:38:23,119][15372] Avg episode reward: [(0, '34.186')] [2024-08-05 15:38:24,846][15444] Updated weights for policy 0, policy_version 10221 (0.0018) [2024-08-05 15:38:27,846][15444] Updated weights for policy 0, policy_version 10231 (0.0025) [2024-08-05 15:38:28,119][15372] Fps is (10 sec: 25394.6, 60 sec: 24439.5, 300 sec: 24076.1). Total num frames: 83820544. Throughput: 0: 6055.5. Samples: 20954470. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 15:38:28,119][15372] Avg episode reward: [(0, '33.895')] [2024-08-05 15:38:31,627][15444] Updated weights for policy 0, policy_version 10241 (0.0013) [2024-08-05 15:38:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.3, 300 sec: 24020.6). Total num frames: 83927040. Throughput: 0: 6012.2. Samples: 20989110. 
Policy #0 lag: (min: 2.0, avg: 3.9, max: 8.0) [2024-08-05 15:38:33,119][15372] Avg episode reward: [(0, '34.044')] [2024-08-05 15:38:34,566][15444] Updated weights for policy 0, policy_version 10251 (0.0032) [2024-08-05 15:38:38,118][15372] Fps is (10 sec: 22938.3, 60 sec: 24166.4, 300 sec: 24020.6). Total num frames: 84049920. Throughput: 0: 6013.1. Samples: 21008350. Policy #0 lag: (min: 2.0, avg: 3.9, max: 8.0) [2024-08-05 15:38:38,126][15372] Avg episode reward: [(0, '34.726')] [2024-08-05 15:38:38,243][15444] Updated weights for policy 0, policy_version 10261 (0.0034) [2024-08-05 15:38:38,356][15417] Signal inference workers to stop experience collection... (3600 times) [2024-08-05 15:38:38,357][15417] Signal inference workers to resume experience collection... (3600 times) [2024-08-05 15:38:38,404][15444] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-08-05 15:38:38,416][15444] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-08-05 15:38:41,830][15444] Updated weights for policy 0, policy_version 10271 (0.0021) [2024-08-05 15:38:43,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24029.8, 300 sec: 24048.4). Total num frames: 84172800. Throughput: 0: 6020.6. Samples: 21043820. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:38:43,119][15372] Avg episode reward: [(0, '34.006')] [2024-08-05 15:38:44,913][15444] Updated weights for policy 0, policy_version 10281 (0.0014) [2024-08-05 15:38:48,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.3, 300 sec: 23992.9). Total num frames: 84287488. Throughput: 0: 6012.0. Samples: 21079060. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:38:48,126][15372] Avg episode reward: [(0, '34.369')] [2024-08-05 15:38:48,598][15444] Updated weights for policy 0, policy_version 10291 (0.0017) [2024-08-05 15:38:51,739][15444] Updated weights for policy 0, policy_version 10301 (0.0011) [2024-08-05 15:38:53,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.9, 300 sec: 24021.4). Total num frames: 84410368. Throughput: 0: 6017.8. Samples: 21097830. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 15:38:53,119][15372] Avg episode reward: [(0, '34.666')] [2024-08-05 15:38:55,334][15444] Updated weights for policy 0, policy_version 10311 (0.0033) [2024-08-05 15:38:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24048.4). Total num frames: 84533248. Throughput: 0: 6014.5. Samples: 21133680. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 15:38:58,119][15372] Avg episode reward: [(0, '35.164')] [2024-08-05 15:38:58,672][15444] Updated weights for policy 0, policy_version 10321 (0.0012) [2024-08-05 15:39:01,950][15444] Updated weights for policy 0, policy_version 10331 (0.0018) [2024-08-05 15:39:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24020.6). Total num frames: 84656128. Throughput: 0: 6010.0. Samples: 21169550. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 15:39:03,119][15372] Avg episode reward: [(0, '34.651')] [2024-08-05 15:39:05,569][15444] Updated weights for policy 0, policy_version 10341 (0.0022) [2024-08-05 15:39:08,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24048.4). Total num frames: 84779008. Throughput: 0: 6015.5. Samples: 21188140. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 15:39:08,126][15372] Avg episode reward: [(0, '34.690')] [2024-08-05 15:39:08,926][15444] Updated weights for policy 0, policy_version 10351 (0.0016) [2024-08-05 15:39:12,444][15444] Updated weights for policy 0, policy_version 10361 (0.0012) [2024-08-05 15:39:13,120][15372] Fps is (10 sec: 23752.9, 60 sec: 23892.7, 300 sec: 24020.5). Total num frames: 84893696. Throughput: 0: 5982.7. Samples: 21223700. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:39:13,128][15372] Avg episode reward: [(0, '34.676')] [2024-08-05 15:39:15,582][15444] Updated weights for policy 0, policy_version 10371 (0.0023) [2024-08-05 15:39:18,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24020.6). Total num frames: 85016576. Throughput: 0: 6003.8. Samples: 21259280. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:39:18,126][15372] Avg episode reward: [(0, '33.922')] [2024-08-05 15:39:18,810][15417] Signal inference workers to stop experience collection... (3650 times) [2024-08-05 15:39:18,818][15417] Signal inference workers to resume experience collection... (3650 times) [2024-08-05 15:39:18,856][15444] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-08-05 15:39:18,861][15444] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-08-05 15:39:19,164][15444] Updated weights for policy 0, policy_version 10381 (0.0021) [2024-08-05 15:39:22,719][15444] Updated weights for policy 0, policy_version 10391 (0.0019) [2024-08-05 15:39:23,119][15372] Fps is (10 sec: 23760.5, 60 sec: 24029.8, 300 sec: 24020.6). Total num frames: 85131264. Throughput: 0: 5983.1. Samples: 21277590. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:39:23,119][15372] Avg episode reward: [(0, '33.847')] [2024-08-05 15:39:26,090][15444] Updated weights for policy 0, policy_version 10401 (0.0015) [2024-08-05 15:39:28,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23893.3, 300 sec: 24020.6). Total num frames: 85254144. Throughput: 0: 5988.4. Samples: 21313300. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:39:28,121][15372] Avg episode reward: [(0, '34.550')] [2024-08-05 15:39:29,652][15444] Updated weights for policy 0, policy_version 10411 (0.0011) [2024-08-05 15:39:32,780][15444] Updated weights for policy 0, policy_version 10421 (0.0016) [2024-08-05 15:39:33,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24020.6). Total num frames: 85377024. Throughput: 0: 6009.1. Samples: 21349470. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:39:33,119][15372] Avg episode reward: [(0, '35.394')] [2024-08-05 15:39:36,154][15444] Updated weights for policy 0, policy_version 10431 (0.0018) [2024-08-05 15:39:38,128][15372] Fps is (10 sec: 23735.1, 60 sec: 24026.1, 300 sec: 24019.8). Total num frames: 85491712. Throughput: 0: 6000.7. Samples: 21367920. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:39:38,136][15372] Avg episode reward: [(0, '35.528')] [2024-08-05 15:39:39,836][15444] Updated weights for policy 0, policy_version 10441 (0.0030) [2024-08-05 15:39:42,688][15444] Updated weights for policy 0, policy_version 10451 (0.0010) [2024-08-05 15:39:43,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 23992.8). Total num frames: 85614592. Throughput: 0: 6031.1. Samples: 21405080. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:39:43,119][15372] Avg episode reward: [(0, '35.059')] [2024-08-05 15:39:46,399][15444] Updated weights for policy 0, policy_version 10461 (0.0014) [2024-08-05 15:39:48,118][15372] Fps is (10 sec: 23779.3, 60 sec: 24029.9, 300 sec: 23992.8). 
Total num frames: 85729280. Throughput: 0: 6019.6. Samples: 21440430. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:39:48,119][15372] Avg episode reward: [(0, '35.518')] [2024-08-05 15:39:49,699][15444] Updated weights for policy 0, policy_version 10471 (0.0020) [2024-08-05 15:39:53,001][15444] Updated weights for policy 0, policy_version 10481 (0.0017) [2024-08-05 15:39:53,121][15372] Fps is (10 sec: 24571.1, 60 sec: 24165.5, 300 sec: 24048.2). Total num frames: 85860352. Throughput: 0: 6027.5. Samples: 21459390. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:39:53,121][15372] Avg episode reward: [(0, '35.046')] [2024-08-05 15:39:56,644][15444] Updated weights for policy 0, policy_version 10491 (0.0012) [2024-08-05 15:39:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24020.6). Total num frames: 85975040. Throughput: 0: 6037.8. Samples: 21495390. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 15:39:58,119][15372] Avg episode reward: [(0, '35.555')] [2024-08-05 15:39:59,552][15444] Updated weights for policy 0, policy_version 10501 (0.0034) [2024-08-05 15:40:00,824][15417] Signal inference workers to stop experience collection... (3700 times) [2024-08-05 15:40:00,826][15417] Signal inference workers to resume experience collection... (3700 times) [2024-08-05 15:40:00,878][15444] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-08-05 15:40:00,885][15444] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-08-05 15:40:03,118][15372] Fps is (10 sec: 23761.9, 60 sec: 24029.9, 300 sec: 24020.6). Total num frames: 86097920. Throughput: 0: 6065.8. Samples: 21532240. 
Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 15:40:03,119][15372] Avg episode reward: [(0, '34.369')] [2024-08-05 15:40:03,235][15444] Updated weights for policy 0, policy_version 10511 (0.0012) [2024-08-05 15:40:06,523][15444] Updated weights for policy 0, policy_version 10521 (0.0018) [2024-08-05 15:40:08,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 86228992. Throughput: 0: 6068.5. Samples: 21550670. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 15:40:08,119][15372] Avg episode reward: [(0, '32.833')] [2024-08-05 15:40:09,757][15444] Updated weights for policy 0, policy_version 10531 (0.0028) [2024-08-05 15:40:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24167.1, 300 sec: 24131.7). Total num frames: 86343680. Throughput: 0: 6092.5. Samples: 21587460. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:40:13,126][15372] Avg episode reward: [(0, '33.054')] [2024-08-05 15:40:13,501][15444] Updated weights for policy 0, policy_version 10541 (0.0023) [2024-08-05 15:40:16,395][15444] Updated weights for policy 0, policy_version 10551 (0.0036) [2024-08-05 15:40:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 86466560. Throughput: 0: 6078.7. Samples: 21623010. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:40:18,119][15372] Avg episode reward: [(0, '34.899')] [2024-08-05 15:40:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000010555_86466560.pth... [2024-08-05 15:40:18,257][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000009846_80658432.pth [2024-08-05 15:40:20,244][15444] Updated weights for policy 0, policy_version 10561 (0.0013) [2024-08-05 15:40:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 86589440. 
Throughput: 0: 6074.4. Samples: 21641210. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 15:40:23,126][15372] Avg episode reward: [(0, '36.040')] [2024-08-05 15:40:23,127][15417] Saving new best policy, reward=36.040! [2024-08-05 15:40:23,344][15444] Updated weights for policy 0, policy_version 10571 (0.0015) [2024-08-05 15:40:26,819][15444] Updated weights for policy 0, policy_version 10581 (0.0015) [2024-08-05 15:40:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 86704128. Throughput: 0: 6043.3. Samples: 21677030. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 15:40:28,119][15372] Avg episode reward: [(0, '36.111')] [2024-08-05 15:40:28,197][15417] Saving new best policy, reward=36.111! [2024-08-05 15:40:30,207][15444] Updated weights for policy 0, policy_version 10591 (0.0018) [2024-08-05 15:40:33,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 86818816. Throughput: 0: 6065.3. Samples: 21713370. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:40:33,126][15372] Avg episode reward: [(0, '35.675')] [2024-08-05 15:40:33,613][15444] Updated weights for policy 0, policy_version 10601 (0.0012) [2024-08-05 15:40:35,539][15417] Signal inference workers to stop experience collection... (3750 times) [2024-08-05 15:40:35,540][15417] Signal inference workers to resume experience collection... (3750 times) [2024-08-05 15:40:35,592][15444] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-08-05 15:40:35,595][15444] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-08-05 15:40:37,155][15444] Updated weights for policy 0, policy_version 10611 (0.0016) [2024-08-05 15:40:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24306.8, 300 sec: 24187.2). Total num frames: 86949888. Throughput: 0: 6035.6. Samples: 21730980. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:40:38,119][15372] Avg episode reward: [(0, '35.812')] [2024-08-05 15:40:40,219][15444] Updated weights for policy 0, policy_version 10621 (0.0021) [2024-08-05 15:40:43,118][15372] Fps is (10 sec: 25395.0, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 87072768. Throughput: 0: 6068.9. Samples: 21768490. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:40:43,126][15372] Avg episode reward: [(0, '35.756')] [2024-08-05 15:40:43,826][15444] Updated weights for policy 0, policy_version 10631 (0.0020) [2024-08-05 15:40:47,139][15444] Updated weights for policy 0, policy_version 10641 (0.0012) [2024-08-05 15:40:48,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24439.3, 300 sec: 24187.2). Total num frames: 87195648. Throughput: 0: 6050.4. Samples: 21804510. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:40:48,119][15372] Avg episode reward: [(0, '35.415')] [2024-08-05 15:40:50,330][15444] Updated weights for policy 0, policy_version 10651 (0.0011) [2024-08-05 15:40:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.8, 300 sec: 24187.2). Total num frames: 87318528. Throughput: 0: 6055.8. Samples: 21823180. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:40:53,119][15372] Avg episode reward: [(0, '35.173')] [2024-08-05 15:40:53,771][15444] Updated weights for policy 0, policy_version 10661 (0.0017) [2024-08-05 15:40:57,032][15444] Updated weights for policy 0, policy_version 10671 (0.0012) [2024-08-05 15:40:58,119][15372] Fps is (10 sec: 23757.2, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 87433216. Throughput: 0: 6047.1. Samples: 21859580. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:40:58,119][15372] Avg episode reward: [(0, '34.634')] [2024-08-05 15:41:00,633][15444] Updated weights for policy 0, policy_version 10681 (0.0015) [2024-08-05 15:41:03,119][15372] Fps is (10 sec: 24574.7, 60 sec: 24439.2, 300 sec: 24215.0). 
Total num frames: 87564288. Throughput: 0: 6085.5. Samples: 21896860. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:41:03,119][15372] Avg episode reward: [(0, '34.091')] [2024-08-05 15:41:03,545][15444] Updated weights for policy 0, policy_version 10691 (0.0021) [2024-08-05 15:41:07,205][15444] Updated weights for policy 0, policy_version 10701 (0.0020) [2024-08-05 15:41:08,118][15372] Fps is (10 sec: 25395.7, 60 sec: 24302.9, 300 sec: 24243.2). Total num frames: 87687168. Throughput: 0: 6069.3. Samples: 21914330. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:41:08,119][15372] Avg episode reward: [(0, '34.191')] [2024-08-05 15:41:09,438][15417] Signal inference workers to stop experience collection... (3800 times) [2024-08-05 15:41:09,438][15417] Signal inference workers to resume experience collection... (3800 times) [2024-08-05 15:41:09,502][15444] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-08-05 15:41:09,503][15444] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-08-05 15:41:10,976][15444] Updated weights for policy 0, policy_version 10711 (0.0023) [2024-08-05 15:41:13,118][15372] Fps is (10 sec: 22939.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 87793664. Throughput: 0: 6092.5. Samples: 21951190. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:41:13,119][15372] Avg episode reward: [(0, '34.984')] [2024-08-05 15:41:13,814][15444] Updated weights for policy 0, policy_version 10721 (0.0025) [2024-08-05 15:41:17,547][15444] Updated weights for policy 0, policy_version 10731 (0.0012) [2024-08-05 15:41:18,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24302.7, 300 sec: 24215.0). Total num frames: 87924736. Throughput: 0: 6089.0. Samples: 21987380. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:41:18,120][15372] Avg episode reward: [(0, '35.013')] [2024-08-05 15:41:20,595][15444] Updated weights for policy 0, policy_version 10741 (0.0010) [2024-08-05 15:41:23,119][15372] Fps is (10 sec: 25393.8, 60 sec: 24302.7, 300 sec: 24242.8). Total num frames: 88047616. Throughput: 0: 6105.9. Samples: 22005750. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 15:41:23,127][15372] Avg episode reward: [(0, '34.290')] [2024-08-05 15:41:24,089][15444] Updated weights for policy 0, policy_version 10751 (0.0026) [2024-08-05 15:41:27,634][15444] Updated weights for policy 0, policy_version 10761 (0.0018) [2024-08-05 15:41:28,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 88170496. Throughput: 0: 6085.1. Samples: 22042320. Policy #0 lag: (min: 1.0, avg: 3.8, max: 9.0) [2024-08-05 15:41:28,119][15372] Avg episode reward: [(0, '34.057')] [2024-08-05 15:41:30,874][15444] Updated weights for policy 0, policy_version 10771 (0.0015) [2024-08-05 15:41:33,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24439.2, 300 sec: 24215.0). Total num frames: 88285184. Throughput: 0: 6067.3. Samples: 22077540. Policy #0 lag: (min: 1.0, avg: 3.8, max: 9.0) [2024-08-05 15:41:33,128][15372] Avg episode reward: [(0, '33.878')] [2024-08-05 15:41:34,410][15444] Updated weights for policy 0, policy_version 10781 (0.0023) [2024-08-05 15:41:37,952][15444] Updated weights for policy 0, policy_version 10791 (0.0019) [2024-08-05 15:41:38,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 88399872. Throughput: 0: 6058.2. Samples: 22095800. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 15:41:38,119][15372] Avg episode reward: [(0, '35.408')] [2024-08-05 15:41:40,921][15444] Updated weights for policy 0, policy_version 10801 (0.0011) [2024-08-05 15:41:43,118][15372] Fps is (10 sec: 24577.6, 60 sec: 24303.0, 300 sec: 24215.0). 
Total num frames: 88530944. Throughput: 0: 6045.1. Samples: 22131610. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 15:41:43,126][15372] Avg episode reward: [(0, '35.372')] [2024-08-05 15:41:44,736][15444] Updated weights for policy 0, policy_version 10811 (0.0021) [2024-08-05 15:41:47,976][15444] Updated weights for policy 0, policy_version 10821 (0.0013) [2024-08-05 15:41:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24187.3). Total num frames: 88645632. Throughput: 0: 6016.5. Samples: 22167600. Policy #0 lag: (min: 0.0, avg: 2.9, max: 7.0) [2024-08-05 15:41:48,119][15372] Avg episode reward: [(0, '34.987')] [2024-08-05 15:41:49,048][15417] Signal inference workers to stop experience collection... (3850 times) [2024-08-05 15:41:49,048][15417] Signal inference workers to resume experience collection... (3850 times) [2024-08-05 15:41:49,105][15444] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-08-05 15:41:49,106][15444] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-08-05 15:41:51,165][15444] Updated weights for policy 0, policy_version 10831 (0.0024) [2024-08-05 15:41:53,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 88768512. Throughput: 0: 6046.2. Samples: 22186410. Policy #0 lag: (min: 0.0, avg: 2.9, max: 7.0) [2024-08-05 15:41:53,119][15372] Avg episode reward: [(0, '34.404')] [2024-08-05 15:41:54,769][15444] Updated weights for policy 0, policy_version 10841 (0.0019) [2024-08-05 15:41:58,065][15444] Updated weights for policy 0, policy_version 10851 (0.0016) [2024-08-05 15:41:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 88891392. Throughput: 0: 6050.9. Samples: 22223480. 
Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 15:41:58,119][15372] Avg episode reward: [(0, '34.366')] [2024-08-05 15:42:01,335][15444] Updated weights for policy 0, policy_version 10861 (0.0011) [2024-08-05 15:42:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 89014272. Throughput: 0: 6049.6. Samples: 22259610. Policy #0 lag: (min: 2.0, avg: 4.0, max: 8.0) [2024-08-05 15:42:03,126][15372] Avg episode reward: [(0, '35.853')] [2024-08-05 15:42:04,633][15444] Updated weights for policy 0, policy_version 10871 (0.0011) [2024-08-05 15:42:07,890][15444] Updated weights for policy 0, policy_version 10881 (0.0024) [2024-08-05 15:42:08,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 89137152. Throughput: 0: 6065.6. Samples: 22278700. Policy #0 lag: (min: 2.0, avg: 4.0, max: 8.0) [2024-08-05 15:42:08,121][15372] Avg episode reward: [(0, '36.368')] [2024-08-05 15:42:08,153][15417] Saving new best policy, reward=36.368! [2024-08-05 15:42:11,522][15444] Updated weights for policy 0, policy_version 10891 (0.0019) [2024-08-05 15:42:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 89251840. Throughput: 0: 6044.2. Samples: 22314310. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:42:13,119][15372] Avg episode reward: [(0, '33.754')] [2024-08-05 15:42:14,730][15444] Updated weights for policy 0, policy_version 10901 (0.0010) [2024-08-05 15:42:18,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.6, 300 sec: 24187.3). Total num frames: 89374720. Throughput: 0: 6080.3. Samples: 22351150. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:42:18,126][15372] Avg episode reward: [(0, '34.064')] [2024-08-05 15:42:18,141][15444] Updated weights for policy 0, policy_version 10911 (0.0025) [2024-08-05 15:42:18,145][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000010911_89382912.pth... [2024-08-05 15:42:18,290][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000010201_83566592.pth [2024-08-05 15:42:21,727][15444] Updated weights for policy 0, policy_version 10921 (0.0026) [2024-08-05 15:42:23,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 89497600. Throughput: 0: 6066.2. Samples: 22368780. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 15:42:23,119][15372] Avg episode reward: [(0, '34.721')] [2024-08-05 15:42:24,957][15444] Updated weights for policy 0, policy_version 10931 (0.0013) [2024-08-05 15:42:28,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 89620480. Throughput: 0: 6076.7. Samples: 22405060. Policy #0 lag: (min: 1.0, avg: 4.7, max: 9.0) [2024-08-05 15:42:28,126][15372] Avg episode reward: [(0, '34.681')] [2024-08-05 15:42:28,561][15444] Updated weights for policy 0, policy_version 10941 (0.0015) [2024-08-05 15:42:28,878][15417] Signal inference workers to stop experience collection... (3900 times) [2024-08-05 15:42:28,878][15417] Signal inference workers to resume experience collection... (3900 times) [2024-08-05 15:42:28,951][15444] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-08-05 15:42:28,956][15444] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-08-05 15:42:31,806][15444] Updated weights for policy 0, policy_version 10951 (0.0012) [2024-08-05 15:42:33,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.6, 300 sec: 24187.2). 
Total num frames: 89735168. Throughput: 0: 6081.8. Samples: 22441280. Policy #0 lag: (min: 1.0, avg: 4.7, max: 9.0) [2024-08-05 15:42:33,127][15372] Avg episode reward: [(0, '34.742')] [2024-08-05 15:42:35,160][15444] Updated weights for policy 0, policy_version 10961 (0.0028) [2024-08-05 15:42:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 89866240. Throughput: 0: 6077.3. Samples: 22459890. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 15:42:38,126][15372] Avg episode reward: [(0, '35.452')] [2024-08-05 15:42:38,459][15444] Updated weights for policy 0, policy_version 10971 (0.0025) [2024-08-05 15:42:41,910][15444] Updated weights for policy 0, policy_version 10981 (0.0043) [2024-08-05 15:42:43,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 89980928. Throughput: 0: 6061.8. Samples: 22496260. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 15:42:43,119][15372] Avg episode reward: [(0, '36.273')] [2024-08-05 15:42:45,198][15444] Updated weights for policy 0, policy_version 10991 (0.0011) [2024-08-05 15:42:48,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 90103808. Throughput: 0: 6059.5. Samples: 22532290. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:42:48,126][15372] Avg episode reward: [(0, '36.044')] [2024-08-05 15:42:48,636][15444] Updated weights for policy 0, policy_version 11001 (0.0020) [2024-08-05 15:42:52,135][15444] Updated weights for policy 0, policy_version 11011 (0.0025) [2024-08-05 15:42:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 90226688. Throughput: 0: 6034.7. Samples: 22550260. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 15:42:53,119][15372] Avg episode reward: [(0, '35.252')] [2024-08-05 15:42:55,525][15444] Updated weights for policy 0, policy_version 11021 (0.0013) [2024-08-05 15:42:58,119][15372] Fps is (10 sec: 23755.4, 60 sec: 24166.1, 300 sec: 24187.2). Total num frames: 90341376. Throughput: 0: 6039.9. Samples: 22586110. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 15:42:58,127][15372] Avg episode reward: [(0, '34.725')] [2024-08-05 15:42:59,015][15444] Updated weights for policy 0, policy_version 11031 (0.0022) [2024-08-05 15:43:02,311][15444] Updated weights for policy 0, policy_version 11041 (0.0021) [2024-08-05 15:43:03,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24166.2, 300 sec: 24215.0). Total num frames: 90464256. Throughput: 0: 6027.7. Samples: 22622400. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:43:03,127][15372] Avg episode reward: [(0, '34.789')] [2024-08-05 15:43:05,538][15444] Updated weights for policy 0, policy_version 11051 (0.0011) [2024-08-05 15:43:08,119][15372] Fps is (10 sec: 23757.7, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 90578944. Throughput: 0: 6051.1. Samples: 22641080. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:43:08,119][15372] Avg episode reward: [(0, '34.909')] [2024-08-05 15:43:09,019][15444] Updated weights for policy 0, policy_version 11061 (0.0013) [2024-08-05 15:43:12,516][15444] Updated weights for policy 0, policy_version 11071 (0.0011) [2024-08-05 15:43:13,118][15372] Fps is (10 sec: 23757.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 90701824. Throughput: 0: 6042.5. Samples: 22676970. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 15:43:13,119][15372] Avg episode reward: [(0, '34.111')] [2024-08-05 15:43:15,567][15444] Updated weights for policy 0, policy_version 11081 (0.0014) [2024-08-05 15:43:18,118][15372] Fps is (10 sec: 25395.8, 60 sec: 24302.9, 300 sec: 24215.0). 
Total num frames: 90832896. Throughput: 0: 6042.0. Samples: 22713170. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 15:43:18,126][15372] Avg episode reward: [(0, '35.201')] [2024-08-05 15:43:19,257][15444] Updated weights for policy 0, policy_version 11091 (0.0014) [2024-08-05 15:43:21,681][15417] Signal inference workers to stop experience collection... (3950 times) [2024-08-05 15:43:21,682][15417] Signal inference workers to resume experience collection... (3950 times) [2024-08-05 15:43:21,762][15444] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-08-05 15:43:21,762][15444] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-08-05 15:43:22,589][15444] Updated weights for policy 0, policy_version 11101 (0.0011) [2024-08-05 15:43:23,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 90947584. Throughput: 0: 6030.0. Samples: 22731240. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:43:23,119][15372] Avg episode reward: [(0, '35.710')] [2024-08-05 15:43:25,700][15444] Updated weights for policy 0, policy_version 11111 (0.0011) [2024-08-05 15:43:28,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 91070464. Throughput: 0: 6039.3. Samples: 22768030. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 15:43:28,126][15372] Avg episode reward: [(0, '34.929')] [2024-08-05 15:43:29,429][15444] Updated weights for policy 0, policy_version 11121 (0.0017) [2024-08-05 15:43:32,669][15444] Updated weights for policy 0, policy_version 11131 (0.0020) [2024-08-05 15:43:33,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 91193344. Throughput: 0: 6042.0. Samples: 22804180. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 15:43:33,119][15372] Avg episode reward: [(0, '34.539')] [2024-08-05 15:43:35,965][15444] Updated weights for policy 0, policy_version 11141 (0.0034) [2024-08-05 15:43:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 91316224. Throughput: 0: 6048.7. Samples: 22822450. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:43:38,126][15372] Avg episode reward: [(0, '34.441')] [2024-08-05 15:43:39,510][15444] Updated weights for policy 0, policy_version 11151 (0.0023) [2024-08-05 15:43:42,968][15444] Updated weights for policy 0, policy_version 11161 (0.0028) [2024-08-05 15:43:43,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24166.2, 300 sec: 24215.0). Total num frames: 91430912. Throughput: 0: 6053.6. Samples: 22858520. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:43:43,119][15372] Avg episode reward: [(0, '34.487')] [2024-08-05 15:43:46,202][15444] Updated weights for policy 0, policy_version 11171 (0.0021) [2024-08-05 15:43:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 91553792. Throughput: 0: 6043.6. Samples: 22894360. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 15:43:48,126][15372] Avg episode reward: [(0, '35.387')] [2024-08-05 15:43:49,722][15444] Updated weights for policy 0, policy_version 11181 (0.0019) [2024-08-05 15:43:52,946][15444] Updated weights for policy 0, policy_version 11191 (0.0024) [2024-08-05 15:43:53,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 91676672. Throughput: 0: 6039.4. Samples: 22912850. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 15:43:53,119][15372] Avg episode reward: [(0, '35.361')] [2024-08-05 15:43:56,349][15444] Updated weights for policy 0, policy_version 11201 (0.0012) [2024-08-05 15:43:58,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24166.5, 300 sec: 24187.2). 
Total num frames: 91791360. Throughput: 0: 6048.2. Samples: 22949140. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:43:58,127][15372] Avg episode reward: [(0, '34.405')] [2024-08-05 15:43:59,957][15444] Updated weights for policy 0, policy_version 11211 (0.0014) [2024-08-05 15:44:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 91914240. Throughput: 0: 6061.1. Samples: 22985920. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:44:03,126][15372] Avg episode reward: [(0, '33.877')] [2024-08-05 15:44:03,121][15444] Updated weights for policy 0, policy_version 11221 (0.0019) [2024-08-05 15:44:06,626][15444] Updated weights for policy 0, policy_version 11231 (0.0033) [2024-08-05 15:44:08,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24303.0, 300 sec: 24215.1). Total num frames: 92037120. Throughput: 0: 6071.3. Samples: 23004450. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:44:08,126][15372] Avg episode reward: [(0, '35.316')] [2024-08-05 15:44:09,902][15444] Updated weights for policy 0, policy_version 11241 (0.0014) [2024-08-05 15:44:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 92160000. Throughput: 0: 6059.1. Samples: 23040690. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:44:13,126][15372] Avg episode reward: [(0, '35.014')] [2024-08-05 15:44:13,285][15444] Updated weights for policy 0, policy_version 11251 (0.0017) [2024-08-05 15:44:16,700][15444] Updated weights for policy 0, policy_version 11261 (0.0015) [2024-08-05 15:44:18,119][15372] Fps is (10 sec: 24574.4, 60 sec: 24166.1, 300 sec: 24242.7). Total num frames: 92282880. Throughput: 0: 6048.6. Samples: 23076370. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:44:18,119][15372] Avg episode reward: [(0, '34.628')] [2024-08-05 15:44:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000011265_92282880.pth... [2024-08-05 15:44:18,231][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000010555_86466560.pth [2024-08-05 15:44:19,854][15444] Updated weights for policy 0, policy_version 11271 (0.0011) [2024-08-05 15:44:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 92405760. Throughput: 0: 6059.3. Samples: 23095120. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:44:23,126][15372] Avg episode reward: [(0, '35.216')] [2024-08-05 15:44:23,461][15444] Updated weights for policy 0, policy_version 11281 (0.0011) [2024-08-05 15:44:26,162][15417] Signal inference workers to stop experience collection... (4000 times) [2024-08-05 15:44:26,162][15417] Signal inference workers to resume experience collection... (4000 times) [2024-08-05 15:44:26,236][15444] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-08-05 15:44:26,236][15444] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-08-05 15:44:27,060][15444] Updated weights for policy 0, policy_version 11291 (0.0017) [2024-08-05 15:44:28,118][15372] Fps is (10 sec: 24577.5, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 92528640. Throughput: 0: 6057.2. Samples: 23131090. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:44:28,119][15372] Avg episode reward: [(0, '35.577')] [2024-08-05 15:44:30,058][15444] Updated weights for policy 0, policy_version 11301 (0.0018) [2024-08-05 15:44:33,121][15372] Fps is (10 sec: 23750.8, 60 sec: 24165.3, 300 sec: 24243.3). Total num frames: 92643328. Throughput: 0: 6070.3. Samples: 23167540. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:44:33,129][15372] Avg episode reward: [(0, '35.468')] [2024-08-05 15:44:33,579][15444] Updated weights for policy 0, policy_version 11311 (0.0013) [2024-08-05 15:44:37,064][15444] Updated weights for policy 0, policy_version 11321 (0.0012) [2024-08-05 15:44:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 92766208. Throughput: 0: 6062.7. Samples: 23185670. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:44:38,119][15372] Avg episode reward: [(0, '36.505')] [2024-08-05 15:44:38,129][15417] Saving new best policy, reward=36.505! [2024-08-05 15:44:40,446][15444] Updated weights for policy 0, policy_version 11331 (0.0032) [2024-08-05 15:44:43,132][15372] Fps is (10 sec: 24549.2, 60 sec: 24297.7, 300 sec: 24269.4). Total num frames: 92889088. Throughput: 0: 6061.6. Samples: 23221990. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:44:43,132][15372] Avg episode reward: [(0, '36.460')] [2024-08-05 15:44:43,768][15444] Updated weights for policy 0, policy_version 11341 (0.0028) [2024-08-05 15:44:46,972][15444] Updated weights for policy 0, policy_version 11351 (0.0019) [2024-08-05 15:44:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.2). Total num frames: 93003776. Throughput: 0: 6048.0. Samples: 23258080. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 15:44:48,119][15372] Avg episode reward: [(0, '35.389')] [2024-08-05 15:44:50,556][15444] Updated weights for policy 0, policy_version 11361 (0.0012) [2024-08-05 15:44:53,118][15372] Fps is (10 sec: 23788.9, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 93126656. Throughput: 0: 6051.3. Samples: 23276760. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 15:44:53,119][15372] Avg episode reward: [(0, '34.748')] [2024-08-05 15:44:53,809][15444] Updated weights for policy 0, policy_version 11371 (0.0022) [2024-08-05 15:44:57,366][15444] Updated weights for policy 0, policy_version 11381 (0.0017) [2024-08-05 15:44:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.1, 300 sec: 24242.8). Total num frames: 93249536. Throughput: 0: 6045.8. Samples: 23312750. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:44:58,119][15372] Avg episode reward: [(0, '34.986')] [2024-08-05 15:45:00,456][15444] Updated weights for policy 0, policy_version 11391 (0.0012) [2024-08-05 15:45:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 93372416. Throughput: 0: 6058.5. Samples: 23349000. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:45:03,128][15372] Avg episode reward: [(0, '34.322')] [2024-08-05 15:45:04,150][15444] Updated weights for policy 0, policy_version 11401 (0.0022) [2024-08-05 15:45:07,741][15444] Updated weights for policy 0, policy_version 11411 (0.0013) [2024-08-05 15:45:08,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 93487104. Throughput: 0: 6041.1. Samples: 23366970. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:45:08,119][15372] Avg episode reward: [(0, '35.055')] [2024-08-05 15:45:10,855][15444] Updated weights for policy 0, policy_version 11421 (0.0021) [2024-08-05 15:45:10,973][15417] Signal inference workers to stop experience collection... (4050 times) [2024-08-05 15:45:10,974][15417] Signal inference workers to resume experience collection... 
(4050 times) [2024-08-05 15:45:11,022][15444] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-08-05 15:45:11,027][15444] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-08-05 15:45:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 93609984. Throughput: 0: 6039.8. Samples: 23402880. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:45:13,126][15372] Avg episode reward: [(0, '35.647')] [2024-08-05 15:45:14,266][15444] Updated weights for policy 0, policy_version 11431 (0.0028) [2024-08-05 15:45:17,306][15444] Updated weights for policy 0, policy_version 11441 (0.0012) [2024-08-05 15:45:18,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 93732864. Throughput: 0: 6050.7. Samples: 23439810. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:45:18,127][15372] Avg episode reward: [(0, '34.914')] [2024-08-05 15:45:20,835][15444] Updated weights for policy 0, policy_version 11451 (0.0012) [2024-08-05 15:45:23,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 93863936. Throughput: 0: 6065.1. Samples: 23458600. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:45:23,119][15372] Avg episode reward: [(0, '35.155')] [2024-08-05 15:45:24,279][15444] Updated weights for policy 0, policy_version 11461 (0.0017) [2024-08-05 15:45:27,614][15444] Updated weights for policy 0, policy_version 11471 (0.0011) [2024-08-05 15:45:28,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 93978624. Throughput: 0: 6072.9. Samples: 23495190. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:45:28,119][15372] Avg episode reward: [(0, '34.867')] [2024-08-05 15:45:30,981][15444] Updated weights for policy 0, policy_version 11481 (0.0026) [2024-08-05 15:45:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24304.0, 300 sec: 24242.8). 
Total num frames: 94101504. Throughput: 0: 6070.0. Samples: 23531230. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:45:33,119][15372] Avg episode reward: [(0, '34.894')] [2024-08-05 15:45:34,471][15444] Updated weights for policy 0, policy_version 11491 (0.0021) [2024-08-05 15:45:37,815][15444] Updated weights for policy 0, policy_version 11501 (0.0018) [2024-08-05 15:45:38,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 94216192. Throughput: 0: 6056.7. Samples: 23549310. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:45:38,119][15372] Avg episode reward: [(0, '35.297')] [2024-08-05 15:45:41,144][15444] Updated weights for policy 0, policy_version 11511 (0.0015) [2024-08-05 15:45:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24171.8, 300 sec: 24215.0). Total num frames: 94339072. Throughput: 0: 6060.2. Samples: 23585460. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:45:43,126][15372] Avg episode reward: [(0, '34.881')] [2024-08-05 15:45:44,441][15444] Updated weights for policy 0, policy_version 11521 (0.0020) [2024-08-05 15:45:48,011][15444] Updated weights for policy 0, policy_version 11531 (0.0020) [2024-08-05 15:45:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 94461952. Throughput: 0: 6065.1. Samples: 23621930. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 15:45:48,119][15372] Avg episode reward: [(0, '35.102')] [2024-08-05 15:45:51,452][15444] Updated weights for policy 0, policy_version 11541 (0.0020) [2024-08-05 15:45:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 94584832. Throughput: 0: 6080.0. Samples: 23640570. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 15:45:53,126][15372] Avg episode reward: [(0, '34.745')] [2024-08-05 15:45:54,672][15444] Updated weights for policy 0, policy_version 11551 (0.0021) [2024-08-05 15:45:57,949][15444] Updated weights for policy 0, policy_version 11561 (0.0009) [2024-08-05 15:45:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 94707712. Throughput: 0: 6094.2. Samples: 23677120. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:45:58,119][15372] Avg episode reward: [(0, '34.357')] [2024-08-05 15:46:01,422][15444] Updated weights for policy 0, policy_version 11571 (0.0012) [2024-08-05 15:46:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 94822400. Throughput: 0: 6083.8. Samples: 23713580. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:46:03,126][15372] Avg episode reward: [(0, '34.650')] [2024-08-05 15:46:04,085][15417] Signal inference workers to stop experience collection... (4100 times) [2024-08-05 15:46:04,085][15417] Signal inference workers to resume experience collection... (4100 times) [2024-08-05 15:46:04,124][15444] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-08-05 15:46:04,124][15444] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-08-05 15:46:04,662][15444] Updated weights for policy 0, policy_version 11581 (0.0030) [2024-08-05 15:46:07,896][15444] Updated weights for policy 0, policy_version 11591 (0.0011) [2024-08-05 15:46:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 94953472. Throughput: 0: 6075.6. Samples: 23732000. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:46:08,119][15372] Avg episode reward: [(0, '33.843')] [2024-08-05 15:46:11,531][15444] Updated weights for policy 0, policy_version 11601 (0.0013) [2024-08-05 15:46:13,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.8, 300 sec: 24215.0). Total num frames: 95068160. Throughput: 0: 6074.0. Samples: 23768520. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:46:13,119][15372] Avg episode reward: [(0, '34.651')] [2024-08-05 15:46:14,506][15444] Updated weights for policy 0, policy_version 11611 (0.0023) [2024-08-05 15:46:18,009][15444] Updated weights for policy 0, policy_version 11621 (0.0011) [2024-08-05 15:46:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.6, 300 sec: 24242.8). Total num frames: 95199232. Throughput: 0: 6097.1. Samples: 23805600. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 15:46:18,119][15372] Avg episode reward: [(0, '34.707')] [2024-08-05 15:46:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000011621_95199232.pth... [2024-08-05 15:46:18,249][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000010911_89382912.pth [2024-08-05 15:46:21,538][15444] Updated weights for policy 0, policy_version 11631 (0.0024) [2024-08-05 15:46:23,118][15372] Fps is (10 sec: 25395.8, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 95322112. Throughput: 0: 6107.8. Samples: 23824160. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 15:46:23,126][15372] Avg episode reward: [(0, '35.698')] [2024-08-05 15:46:24,776][15444] Updated weights for policy 0, policy_version 11641 (0.0010) [2024-08-05 15:46:28,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24302.8, 300 sec: 24242.8). Total num frames: 95436800. Throughput: 0: 6120.4. Samples: 23860880. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:46:28,126][15372] Avg episode reward: [(0, '35.540')] [2024-08-05 15:46:28,231][15444] Updated weights for policy 0, policy_version 11651 (0.0029) [2024-08-05 15:46:31,462][15444] Updated weights for policy 0, policy_version 11661 (0.0013) [2024-08-05 15:46:33,119][15372] Fps is (10 sec: 23755.4, 60 sec: 24302.7, 300 sec: 24270.5). Total num frames: 95559680. Throughput: 0: 6102.6. Samples: 23896550. Policy #0 lag: (min: 0.0, avg: 2.9, max: 8.0) [2024-08-05 15:46:33,127][15372] Avg episode reward: [(0, '35.406')] [2024-08-05 15:46:35,099][15444] Updated weights for policy 0, policy_version 11671 (0.0010) [2024-08-05 15:46:38,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 95682560. Throughput: 0: 6103.6. Samples: 23915230. Policy #0 lag: (min: 0.0, avg: 2.9, max: 8.0) [2024-08-05 15:46:38,126][15372] Avg episode reward: [(0, '35.223')] [2024-08-05 15:46:38,190][15444] Updated weights for policy 0, policy_version 11681 (0.0015) [2024-08-05 15:46:41,654][15444] Updated weights for policy 0, policy_version 11691 (0.0012) [2024-08-05 15:46:43,119][15372] Fps is (10 sec: 24577.0, 60 sec: 24439.4, 300 sec: 24270.5). Total num frames: 95805440. Throughput: 0: 6093.3. Samples: 23951320. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:46:43,126][15372] Avg episode reward: [(0, '34.395')] [2024-08-05 15:46:44,885][15444] Updated weights for policy 0, policy_version 11701 (0.0017) [2024-08-05 15:46:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 95920128. Throughput: 0: 6098.7. Samples: 23988020. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:46:48,126][15372] Avg episode reward: [(0, '34.798')] [2024-08-05 15:46:48,509][15444] Updated weights for policy 0, policy_version 11711 (0.0011) [2024-08-05 15:46:51,720][15444] Updated weights for policy 0, policy_version 11721 (0.0015) [2024-08-05 15:46:53,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 96051200. Throughput: 0: 6093.8. Samples: 24006220. Policy #0 lag: (min: 2.0, avg: 4.2, max: 8.0) [2024-08-05 15:46:53,119][15372] Avg episode reward: [(0, '35.310')] [2024-08-05 15:46:54,556][15417] Signal inference workers to stop experience collection... (4150 times) [2024-08-05 15:46:54,561][15417] Signal inference workers to resume experience collection... (4150 times) [2024-08-05 15:46:54,623][15444] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-08-05 15:46:54,623][15444] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-08-05 15:46:55,120][15444] Updated weights for policy 0, policy_version 11731 (0.0016) [2024-08-05 15:46:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 96165888. Throughput: 0: 6100.9. Samples: 24043060. Policy #0 lag: (min: 2.0, avg: 4.2, max: 8.0) [2024-08-05 15:46:58,119][15372] Avg episode reward: [(0, '35.181')] [2024-08-05 15:46:58,759][15444] Updated weights for policy 0, policy_version 11741 (0.0038) [2024-08-05 15:47:01,824][15444] Updated weights for policy 0, policy_version 11751 (0.0031) [2024-08-05 15:47:03,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24439.3, 300 sec: 24242.8). Total num frames: 96288768. Throughput: 0: 6066.9. Samples: 24078610. 
Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 15:47:03,119][15372] Avg episode reward: [(0, '35.689')] [2024-08-05 15:47:05,337][15444] Updated weights for policy 0, policy_version 11761 (0.0019) [2024-08-05 15:47:08,124][15372] Fps is (10 sec: 24562.6, 60 sec: 24300.7, 300 sec: 24270.1). Total num frames: 96411648. Throughput: 0: 6061.9. Samples: 24096980. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 15:47:08,124][15372] Avg episode reward: [(0, '35.664')] [2024-08-05 15:47:08,771][15444] Updated weights for policy 0, policy_version 11771 (0.0014) [2024-08-05 15:47:11,883][15444] Updated weights for policy 0, policy_version 11781 (0.0012) [2024-08-05 15:47:13,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24439.6, 300 sec: 24270.5). Total num frames: 96534528. Throughput: 0: 6055.1. Samples: 24133360. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:47:13,119][15372] Avg episode reward: [(0, '34.321')] [2024-08-05 15:47:15,445][15444] Updated weights for policy 0, policy_version 11791 (0.0039) [2024-08-05 15:47:18,118][15372] Fps is (10 sec: 24589.8, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 96657408. Throughput: 0: 6078.5. Samples: 24170080. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:47:18,126][15372] Avg episode reward: [(0, '34.555')] [2024-08-05 15:47:18,710][15444] Updated weights for policy 0, policy_version 11801 (0.0024) [2024-08-05 15:47:22,182][15444] Updated weights for policy 0, policy_version 11811 (0.0022) [2024-08-05 15:47:23,119][15372] Fps is (10 sec: 23755.5, 60 sec: 24166.2, 300 sec: 24242.7). Total num frames: 96772096. Throughput: 0: 6055.9. Samples: 24187750. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 15:47:23,120][15372] Avg episode reward: [(0, '35.163')] [2024-08-05 15:47:25,591][15444] Updated weights for policy 0, policy_version 11821 (0.0011) [2024-08-05 15:47:28,119][15372] Fps is (10 sec: 22937.2, 60 sec: 24166.4, 300 sec: 24242.8). 
Total num frames: 96886784. Throughput: 0: 6070.9. Samples: 24224510. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 15:47:28,119][15372] Avg episode reward: [(0, '35.856')] [2024-08-05 15:47:28,746][15444] Updated weights for policy 0, policy_version 11831 (0.0038) [2024-08-05 15:47:30,763][15417] Signal inference workers to stop experience collection... (4200 times) [2024-08-05 15:47:30,764][15417] Signal inference workers to resume experience collection... (4200 times) [2024-08-05 15:47:30,828][15444] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-08-05 15:47:30,829][15444] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-08-05 15:47:32,474][15444] Updated weights for policy 0, policy_version 11841 (0.0031) [2024-08-05 15:47:33,118][15372] Fps is (10 sec: 24577.4, 60 sec: 24303.2, 300 sec: 24242.8). Total num frames: 97017856. Throughput: 0: 6069.6. Samples: 24261150. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:47:33,119][15372] Avg episode reward: [(0, '36.038')] [2024-08-05 15:47:35,514][15444] Updated weights for policy 0, policy_version 11851 (0.0034) [2024-08-05 15:47:38,118][15372] Fps is (10 sec: 25395.6, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 97140736. Throughput: 0: 6074.4. Samples: 24279570. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 15:47:38,126][15372] Avg episode reward: [(0, '35.835')] [2024-08-05 15:47:38,980][15444] Updated weights for policy 0, policy_version 11861 (0.0013) [2024-08-05 15:47:42,601][15444] Updated weights for policy 0, policy_version 11871 (0.0010) [2024-08-05 15:47:43,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 97263616. Throughput: 0: 6048.7. Samples: 24315250. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 15:47:43,119][15372] Avg episode reward: [(0, '35.428')] [2024-08-05 15:47:45,690][15444] Updated weights for policy 0, policy_version 11881 (0.0011) [2024-08-05 15:47:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 97386496. Throughput: 0: 6067.4. Samples: 24351640. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:47:48,126][15372] Avg episode reward: [(0, '35.750')] [2024-08-05 15:47:49,207][15444] Updated weights for policy 0, policy_version 11891 (0.0015) [2024-08-05 15:47:52,607][15444] Updated weights for policy 0, policy_version 11901 (0.0011) [2024-08-05 15:47:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24270.6). Total num frames: 97501184. Throughput: 0: 6063.9. Samples: 24369820. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 15:47:53,119][15372] Avg episode reward: [(0, '35.673')] [2024-08-05 15:47:56,060][15444] Updated weights for policy 0, policy_version 11911 (0.0012) [2024-08-05 15:47:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24270.6). Total num frames: 97624064. Throughput: 0: 6054.2. Samples: 24405800. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:47:58,119][15372] Avg episode reward: [(0, '35.706')] [2024-08-05 15:47:59,651][15444] Updated weights for policy 0, policy_version 11921 (0.0025) [2024-08-05 15:48:03,023][15444] Updated weights for policy 0, policy_version 11931 (0.0015) [2024-08-05 15:48:03,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.5, 300 sec: 24270.6). Total num frames: 97738752. Throughput: 0: 6025.1. Samples: 24441210. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:48:03,119][15372] Avg episode reward: [(0, '36.090')] [2024-08-05 15:48:06,262][15444] Updated weights for policy 0, policy_version 11941 (0.0011) [2024-08-05 15:48:08,118][15372] Fps is (10 sec: 22937.4, 60 sec: 24032.1, 300 sec: 24242.8). 
Total num frames: 97853440. Throughput: 0: 6045.6. Samples: 24459800. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 15:48:08,126][15372] Avg episode reward: [(0, '35.728')] [2024-08-05 15:48:09,547][15444] Updated weights for policy 0, policy_version 11951 (0.0011) [2024-08-05 15:48:13,055][15444] Updated weights for policy 0, policy_version 11961 (0.0016) [2024-08-05 15:48:13,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 97984512. Throughput: 0: 6038.5. Samples: 24496240. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 15:48:13,119][15372] Avg episode reward: [(0, '34.914')] [2024-08-05 15:48:14,708][15417] Signal inference workers to stop experience collection... (4250 times) [2024-08-05 15:48:14,710][15417] Signal inference workers to resume experience collection... (4250 times) [2024-08-05 15:48:14,783][15444] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-08-05 15:48:14,784][15444] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-08-05 15:48:16,504][15444] Updated weights for policy 0, policy_version 11971 (0.0012) [2024-08-05 15:48:18,118][15372] Fps is (10 sec: 25395.4, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 98107392. Throughput: 0: 6039.8. Samples: 24532940. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:48:18,119][15372] Avg episode reward: [(0, '35.410')] [2024-08-05 15:48:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000011976_98107392.pth... [2024-08-05 15:48:18,275][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000011265_92282880.pth [2024-08-05 15:48:19,629][15444] Updated weights for policy 0, policy_version 11981 (0.0019) [2024-08-05 15:48:23,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.6, 300 sec: 24242.8). 
Total num frames: 98222080. Throughput: 0: 6041.3. Samples: 24551430. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:48:23,119][15372] Avg episode reward: [(0, '35.560')] [2024-08-05 15:48:23,254][15444] Updated weights for policy 0, policy_version 11991 (0.0019) [2024-08-05 15:48:26,354][15444] Updated weights for policy 0, policy_version 12001 (0.0030) [2024-08-05 15:48:28,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 98353152. Throughput: 0: 6052.5. Samples: 24587610. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:48:28,119][15372] Avg episode reward: [(0, '35.763')] [2024-08-05 15:48:29,936][15444] Updated weights for policy 0, policy_version 12011 (0.0020) [2024-08-05 15:48:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 98467840. Throughput: 0: 6048.5. Samples: 24623820. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 15:48:33,126][15372] Avg episode reward: [(0, '34.768')] [2024-08-05 15:48:33,213][15444] Updated weights for policy 0, policy_version 12021 (0.0029) [2024-08-05 15:48:36,468][15444] Updated weights for policy 0, policy_version 12031 (0.0034) [2024-08-05 15:48:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24270.6). Total num frames: 98590720. Throughput: 0: 6067.1. Samples: 24642840. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 15:48:38,126][15372] Avg episode reward: [(0, '35.132')] [2024-08-05 15:48:40,051][15444] Updated weights for policy 0, policy_version 12041 (0.0018) [2024-08-05 15:48:43,121][15372] Fps is (10 sec: 24569.8, 60 sec: 24165.4, 300 sec: 24270.3). Total num frames: 98713600. Throughput: 0: 6071.9. Samples: 24679050. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:48:43,129][15372] Avg episode reward: [(0, '35.420')] [2024-08-05 15:48:43,299][15444] Updated weights for policy 0, policy_version 12051 (0.0016) [2024-08-05 15:48:46,558][15444] Updated weights for policy 0, policy_version 12061 (0.0021) [2024-08-05 15:48:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 98836480. Throughput: 0: 6087.3. Samples: 24715140. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:48:48,126][15372] Avg episode reward: [(0, '35.115')] [2024-08-05 15:48:49,843][15444] Updated weights for policy 0, policy_version 12071 (0.0033) [2024-08-05 15:48:53,119][15372] Fps is (10 sec: 24580.9, 60 sec: 24302.7, 300 sec: 24298.3). Total num frames: 98959360. Throughput: 0: 6091.9. Samples: 24733940. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:48:53,127][15372] Avg episode reward: [(0, '35.488')] [2024-08-05 15:48:53,373][15444] Updated weights for policy 0, policy_version 12081 (0.0012) [2024-08-05 15:48:56,864][15444] Updated weights for policy 0, policy_version 12091 (0.0018) [2024-08-05 15:48:57,242][15417] Signal inference workers to stop experience collection... (4300 times) [2024-08-05 15:48:57,243][15417] Signal inference workers to resume experience collection... (4300 times) [2024-08-05 15:48:57,281][15444] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-08-05 15:48:57,286][15444] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-08-05 15:48:58,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 99082240. Throughput: 0: 6091.8. Samples: 24770370. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:48:58,119][15372] Avg episode reward: [(0, '36.258')] [2024-08-05 15:49:00,115][15444] Updated weights for policy 0, policy_version 12101 (0.0018) [2024-08-05 15:49:03,119][15372] Fps is (10 sec: 23757.6, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 99196928. Throughput: 0: 6077.3. Samples: 24806420. Policy #0 lag: (min: 1.0, avg: 4.5, max: 9.0) [2024-08-05 15:49:03,126][15372] Avg episode reward: [(0, '34.776')] [2024-08-05 15:49:03,447][15444] Updated weights for policy 0, policy_version 12111 (0.0014) [2024-08-05 15:49:06,943][15444] Updated weights for policy 0, policy_version 12121 (0.0017) [2024-08-05 15:49:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 99319808. Throughput: 0: 6073.3. Samples: 24824730. Policy #0 lag: (min: 1.0, avg: 4.5, max: 9.0) [2024-08-05 15:49:08,119][15372] Avg episode reward: [(0, '34.407')] [2024-08-05 15:49:10,189][15444] Updated weights for policy 0, policy_version 12131 (0.0011) [2024-08-05 15:49:13,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24270.6). Total num frames: 99442688. Throughput: 0: 6083.5. Samples: 24861370. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:49:13,127][15372] Avg episode reward: [(0, '34.943')] [2024-08-05 15:49:13,606][15444] Updated weights for policy 0, policy_version 12141 (0.0012) [2024-08-05 15:49:17,103][15444] Updated weights for policy 0, policy_version 12151 (0.0024) [2024-08-05 15:49:18,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 99565568. Throughput: 0: 6079.5. Samples: 24897400. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:49:18,119][15372] Avg episode reward: [(0, '35.329')] [2024-08-05 15:49:20,281][15444] Updated weights for policy 0, policy_version 12161 (0.0014) [2024-08-05 15:49:23,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24302.9, 300 sec: 24242.8). 
Total num frames: 99680256. Throughput: 0: 6071.3. Samples: 24916050. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 15:49:23,119][15372] Avg episode reward: [(0, '35.895')] [2024-08-05 15:49:23,661][15444] Updated weights for policy 0, policy_version 12171 (0.0012) [2024-08-05 15:49:27,364][15444] Updated weights for policy 0, policy_version 12181 (0.0017) [2024-08-05 15:49:28,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24298.5). Total num frames: 99811328. Throughput: 0: 6070.3. Samples: 24952200. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 15:49:28,119][15372] Avg episode reward: [(0, '35.744')] [2024-08-05 15:49:30,470][15444] Updated weights for policy 0, policy_version 12191 (0.0011) [2024-08-05 15:49:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 99926016. Throughput: 0: 6066.2. Samples: 24988120. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 15:49:33,126][15372] Avg episode reward: [(0, '34.793')] [2024-08-05 15:49:34,135][15444] Updated weights for policy 0, policy_version 12201 (0.0013) [2024-08-05 15:49:36,263][15417] Signal inference workers to stop experience collection... (4350 times) [2024-08-05 15:49:36,264][15417] Signal inference workers to resume experience collection... (4350 times) [2024-08-05 15:49:36,303][15444] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-08-05 15:49:36,303][15444] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-08-05 15:49:37,352][15444] Updated weights for policy 0, policy_version 12211 (0.0019) [2024-08-05 15:49:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24302.9, 300 sec: 24271.6). Total num frames: 100048896. Throughput: 0: 6052.7. Samples: 25006310. 
Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 15:49:38,119][15372] Avg episode reward: [(0, '34.135')] [2024-08-05 15:49:40,531][15444] Updated weights for policy 0, policy_version 12221 (0.0011) [2024-08-05 15:49:43,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24440.5, 300 sec: 24326.1). Total num frames: 100179968. Throughput: 0: 6073.1. Samples: 25043660. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 15:49:43,119][15372] Avg episode reward: [(0, '34.508')] [2024-08-05 15:49:44,181][15444] Updated weights for policy 0, policy_version 12231 (0.0022) [2024-08-05 15:49:47,272][15444] Updated weights for policy 0, policy_version 12241 (0.0020) [2024-08-05 15:49:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 100286464. Throughput: 0: 6047.4. Samples: 25078550. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:49:48,126][15372] Avg episode reward: [(0, '35.372')] [2024-08-05 15:49:50,930][15444] Updated weights for policy 0, policy_version 12251 (0.0012) [2024-08-05 15:49:53,119][15372] Fps is (10 sec: 22935.9, 60 sec: 24166.3, 300 sec: 24270.5). Total num frames: 100409344. Throughput: 0: 6050.3. Samples: 25097000. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:49:53,119][15372] Avg episode reward: [(0, '35.787')] [2024-08-05 15:49:54,249][15444] Updated weights for policy 0, policy_version 12261 (0.0015) [2024-08-05 15:49:57,594][15444] Updated weights for policy 0, policy_version 12271 (0.0017) [2024-08-05 15:49:58,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 100532224. Throughput: 0: 6047.8. Samples: 25133520. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 15:49:58,119][15372] Avg episode reward: [(0, '35.403')] [2024-08-05 15:50:01,246][15444] Updated weights for policy 0, policy_version 12281 (0.0011) [2024-08-05 15:50:03,118][15372] Fps is (10 sec: 24577.9, 60 sec: 24303.0, 300 sec: 24298.3). 
Total num frames: 100655104. Throughput: 0: 6038.0. Samples: 25169110. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 15:50:03,119][15372] Avg episode reward: [(0, '34.686')] [2024-08-05 15:50:04,363][15444] Updated weights for policy 0, policy_version 12291 (0.0011) [2024-08-05 15:50:07,745][15444] Updated weights for policy 0, policy_version 12301 (0.0022) [2024-08-05 15:50:08,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24302.7, 300 sec: 24298.3). Total num frames: 100777984. Throughput: 0: 6045.5. Samples: 25188100. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:50:08,119][15372] Avg episode reward: [(0, '35.438')] [2024-08-05 15:50:10,862][15444] Updated weights for policy 0, policy_version 12311 (0.0020) [2024-08-05 15:50:13,123][15372] Fps is (10 sec: 23746.3, 60 sec: 24164.7, 300 sec: 24270.2). Total num frames: 100892672. Throughput: 0: 6045.6. Samples: 25224280. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:50:13,131][15372] Avg episode reward: [(0, '35.495')] [2024-08-05 15:50:14,641][15444] Updated weights for policy 0, policy_version 12321 (0.0021) [2024-08-05 15:50:14,752][15417] Signal inference workers to stop experience collection... (4400 times) [2024-08-05 15:50:14,753][15417] Signal inference workers to resume experience collection... (4400 times) [2024-08-05 15:50:14,814][15444] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-08-05 15:50:14,814][15444] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-08-05 15:50:17,927][15444] Updated weights for policy 0, policy_version 12331 (0.0019) [2024-08-05 15:50:18,118][15372] Fps is (10 sec: 23758.0, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 101015552. Throughput: 0: 6053.3. Samples: 25260520. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:50:18,119][15372] Avg episode reward: [(0, '35.665')] [2024-08-05 15:50:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000012331_101015552.pth... [2024-08-05 15:50:18,261][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000011621_95199232.pth [2024-08-05 15:50:21,261][15444] Updated weights for policy 0, policy_version 12341 (0.0031) [2024-08-05 15:50:23,118][15372] Fps is (10 sec: 24586.9, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 101138432. Throughput: 0: 6054.2. Samples: 25278750. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:50:23,126][15372] Avg episode reward: [(0, '35.910')] [2024-08-05 15:50:24,803][15444] Updated weights for policy 0, policy_version 12351 (0.0021) [2024-08-05 15:50:28,110][15444] Updated weights for policy 0, policy_version 12361 (0.0011) [2024-08-05 15:50:28,119][15372] Fps is (10 sec: 24574.5, 60 sec: 24166.2, 300 sec: 24270.5). Total num frames: 101261312. Throughput: 0: 6037.9. Samples: 25315370. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 15:50:28,119][15372] Avg episode reward: [(0, '35.835')] [2024-08-05 15:50:31,408][15444] Updated weights for policy 0, policy_version 12371 (0.0025) [2024-08-05 15:50:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 101376000. Throughput: 0: 6054.0. Samples: 25350980. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:50:33,127][15372] Avg episode reward: [(0, '35.215')] [2024-08-05 15:50:34,930][15444] Updated weights for policy 0, policy_version 12381 (0.0029) [2024-08-05 15:50:38,050][15444] Updated weights for policy 0, policy_version 12391 (0.0022) [2024-08-05 15:50:38,121][15372] Fps is (10 sec: 24571.8, 60 sec: 24302.0, 300 sec: 24298.1). Total num frames: 101507072. 
Throughput: 0: 6057.1. Samples: 25369580. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:50:38,121][15372] Avg episode reward: [(0, '34.752')] [2024-08-05 15:50:41,539][15444] Updated weights for policy 0, policy_version 12401 (0.0012) [2024-08-05 15:50:43,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24029.8, 300 sec: 24270.5). Total num frames: 101621760. Throughput: 0: 6053.8. Samples: 25405940. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:50:43,126][15372] Avg episode reward: [(0, '34.036')] [2024-08-05 15:50:45,011][15444] Updated weights for policy 0, policy_version 12411 (0.0021) [2024-08-05 15:50:48,118][15372] Fps is (10 sec: 23762.4, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 101744640. Throughput: 0: 6073.8. Samples: 25442430. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:50:48,126][15372] Avg episode reward: [(0, '34.080')] [2024-08-05 15:50:48,145][15444] Updated weights for policy 0, policy_version 12421 (0.0020) [2024-08-05 15:50:51,789][15444] Updated weights for policy 0, policy_version 12431 (0.0021) [2024-08-05 15:50:52,246][15417] Signal inference workers to stop experience collection... (4450 times) [2024-08-05 15:50:52,246][15417] Signal inference workers to resume experience collection... (4450 times) [2024-08-05 15:50:52,311][15444] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-08-05 15:50:52,312][15444] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-08-05 15:50:53,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.2, 300 sec: 24270.5). Total num frames: 101867520. Throughput: 0: 6056.9. Samples: 25460660. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:50:53,119][15372] Avg episode reward: [(0, '35.380')] [2024-08-05 15:50:54,985][15444] Updated weights for policy 0, policy_version 12441 (0.0026) [2024-08-05 15:50:58,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 101982208. 
Throughput: 0: 6060.6. Samples: 25496980. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 15:50:58,127][15372] Avg episode reward: [(0, '34.640')] [2024-08-05 15:50:58,513][15444] Updated weights for policy 0, policy_version 12451 (0.0015) [2024-08-05 15:51:01,972][15444] Updated weights for policy 0, policy_version 12461 (0.0015) [2024-08-05 15:51:03,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24166.2, 300 sec: 24242.7). Total num frames: 102105088. Throughput: 0: 6050.8. Samples: 25532810. Policy #0 lag: (min: 1.0, avg: 3.9, max: 9.0) [2024-08-05 15:51:03,119][15372] Avg episode reward: [(0, '33.414')] [2024-08-05 15:51:05,183][15444] Updated weights for policy 0, policy_version 12471 (0.0012) [2024-08-05 15:51:08,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 102227968. Throughput: 0: 6053.9. Samples: 25551180. Policy #0 lag: (min: 1.0, avg: 3.9, max: 9.0) [2024-08-05 15:51:08,127][15372] Avg episode reward: [(0, '34.583')] [2024-08-05 15:51:08,909][15444] Updated weights for policy 0, policy_version 12481 (0.0011) [2024-08-05 15:51:12,078][15444] Updated weights for policy 0, policy_version 12491 (0.0021) [2024-08-05 15:51:13,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24168.0, 300 sec: 24215.0). Total num frames: 102342656. Throughput: 0: 6024.3. Samples: 25586460. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:51:13,119][15372] Avg episode reward: [(0, '35.702')] [2024-08-05 15:51:15,576][15444] Updated weights for policy 0, policy_version 12501 (0.0023) [2024-08-05 15:51:18,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 102473728. Throughput: 0: 6044.7. Samples: 25622990. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:51:18,126][15372] Avg episode reward: [(0, '35.262')] [2024-08-05 15:51:19,048][15444] Updated weights for policy 0, policy_version 12511 (0.0014) [2024-08-05 15:51:22,310][15444] Updated weights for policy 0, policy_version 12521 (0.0018) [2024-08-05 15:51:23,119][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 102588416. Throughput: 0: 6030.1. Samples: 25640920. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 15:51:23,119][15372] Avg episode reward: [(0, '34.714')] [2024-08-05 15:51:26,042][15444] Updated weights for policy 0, policy_version 12531 (0.0016) [2024-08-05 15:51:28,127][15372] Fps is (10 sec: 22917.9, 60 sec: 24026.7, 300 sec: 24214.3). Total num frames: 102703104. Throughput: 0: 6013.8. Samples: 25676610. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 15:51:28,127][15372] Avg episode reward: [(0, '34.670')] [2024-08-05 15:51:28,437][15417] Signal inference workers to stop experience collection... (4500 times) [2024-08-05 15:51:28,439][15417] Signal inference workers to resume experience collection... (4500 times) [2024-08-05 15:51:28,495][15444] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-08-05 15:51:28,501][15444] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-08-05 15:51:29,078][15444] Updated weights for policy 0, policy_version 12541 (0.0019) [2024-08-05 15:51:32,650][15444] Updated weights for policy 0, policy_version 12551 (0.0017) [2024-08-05 15:51:33,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 102825984. Throughput: 0: 5999.3. Samples: 25712400. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 15:51:33,119][15372] Avg episode reward: [(0, '35.096')] [2024-08-05 15:51:35,944][15444] Updated weights for policy 0, policy_version 12561 (0.0013) [2024-08-05 15:51:38,119][15372] Fps is (10 sec: 24597.1, 60 sec: 24030.8, 300 sec: 24215.0). Total num frames: 102948864. Throughput: 0: 5984.2. Samples: 25729950. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 15:51:38,126][15372] Avg episode reward: [(0, '35.169')] [2024-08-05 15:51:39,639][15444] Updated weights for policy 0, policy_version 12571 (0.0013) [2024-08-05 15:51:43,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23893.4, 300 sec: 24187.2). Total num frames: 103055360. Throughput: 0: 5969.8. Samples: 25765620. Policy #0 lag: (min: 0.0, avg: 2.9, max: 8.0) [2024-08-05 15:51:43,126][15372] Avg episode reward: [(0, '35.672')] [2024-08-05 15:51:43,368][15444] Updated weights for policy 0, policy_version 12581 (0.0029) [2024-08-05 15:51:46,316][15444] Updated weights for policy 0, policy_version 12591 (0.0028) [2024-08-05 15:51:48,118][15372] Fps is (10 sec: 22937.8, 60 sec: 23893.3, 300 sec: 24159.5). Total num frames: 103178240. Throughput: 0: 5969.2. Samples: 25801420. Policy #0 lag: (min: 0.0, avg: 2.9, max: 8.0) [2024-08-05 15:51:48,126][15372] Avg episode reward: [(0, '35.892')] [2024-08-05 15:51:49,975][15444] Updated weights for policy 0, policy_version 12601 (0.0026) [2024-08-05 15:51:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 23893.4, 300 sec: 24187.2). Total num frames: 103301120. Throughput: 0: 5959.4. Samples: 25819350. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:51:53,126][15372] Avg episode reward: [(0, '34.996')] [2024-08-05 15:51:53,349][15444] Updated weights for policy 0, policy_version 12611 (0.0021) [2024-08-05 15:51:56,540][15444] Updated weights for policy 0, policy_version 12621 (0.0012) [2024-08-05 15:51:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.4, 300 sec: 24159.5). 
Total num frames: 103415808. Throughput: 0: 5977.4. Samples: 25855440. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 15:51:58,126][15372] Avg episode reward: [(0, '35.242')] [2024-08-05 15:52:00,267][15444] Updated weights for policy 0, policy_version 12631 (0.0019) [2024-08-05 15:52:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24030.0, 300 sec: 24187.7). Total num frames: 103546880. Throughput: 0: 5990.2. Samples: 25892550. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 15:52:03,129][15372] Avg episode reward: [(0, '35.842')] [2024-08-05 15:52:03,289][15444] Updated weights for policy 0, policy_version 12641 (0.0022) [2024-08-05 15:52:04,970][15417] Signal inference workers to stop experience collection... (4550 times) [2024-08-05 15:52:04,971][15417] Signal inference workers to resume experience collection... (4550 times) [2024-08-05 15:52:05,012][15444] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-08-05 15:52:05,018][15444] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-08-05 15:52:06,849][15444] Updated weights for policy 0, policy_version 12651 (0.0026) [2024-08-05 15:52:08,124][15372] Fps is (10 sec: 25383.6, 60 sec: 24028.2, 300 sec: 24186.8). Total num frames: 103669760. Throughput: 0: 5993.2. Samples: 25910640. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:52:08,126][15372] Avg episode reward: [(0, '35.046')] [2024-08-05 15:52:10,152][15444] Updated weights for policy 0, policy_version 12661 (0.0020) [2024-08-05 15:52:13,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.9, 300 sec: 24159.4). Total num frames: 103784448. Throughput: 0: 6020.7. Samples: 25947490. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:52:13,127][15372] Avg episode reward: [(0, '34.735')] [2024-08-05 15:52:13,503][15444] Updated weights for policy 0, policy_version 12671 (0.0014) [2024-08-05 15:52:17,152][15444] Updated weights for policy 0, policy_version 12681 (0.0012) [2024-08-05 15:52:18,118][15372] Fps is (10 sec: 24587.1, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 103915520. Throughput: 0: 6021.8. Samples: 25983380. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:52:18,119][15372] Avg episode reward: [(0, '34.961')] [2024-08-05 15:52:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000012685_103915520.pth... [2024-08-05 15:52:18,231][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000011976_98107392.pth [2024-08-05 15:52:20,107][15444] Updated weights for policy 0, policy_version 12691 (0.0017) [2024-08-05 15:52:23,118][15372] Fps is (10 sec: 23757.4, 60 sec: 23893.4, 300 sec: 24187.2). Total num frames: 104022016. Throughput: 0: 6042.5. Samples: 26001860. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:52:23,126][15372] Avg episode reward: [(0, '35.150')] [2024-08-05 15:52:23,786][15444] Updated weights for policy 0, policy_version 12701 (0.0021) [2024-08-05 15:52:26,981][15444] Updated weights for policy 0, policy_version 12711 (0.0012) [2024-08-05 15:52:28,119][15372] Fps is (10 sec: 22936.5, 60 sec: 24033.1, 300 sec: 24159.4). Total num frames: 104144896. Throughput: 0: 6038.8. Samples: 26037370. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:52:28,120][15372] Avg episode reward: [(0, '35.535')] [2024-08-05 15:52:30,488][15444] Updated weights for policy 0, policy_version 12721 (0.0016) [2024-08-05 15:52:33,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 104275968. 
Throughput: 0: 6060.7. Samples: 26074150. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 15:52:33,119][15372] Avg episode reward: [(0, '35.821')] [2024-08-05 15:52:34,112][15444] Updated weights for policy 0, policy_version 12731 (0.0011) [2024-08-05 15:52:34,769][15417] Signal inference workers to stop experience collection... (4600 times) [2024-08-05 15:52:34,770][15417] Signal inference workers to resume experience collection... (4600 times) [2024-08-05 15:52:34,801][15444] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-08-05 15:52:34,851][15444] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-08-05 15:52:37,108][15444] Updated weights for policy 0, policy_version 12741 (0.0030) [2024-08-05 15:52:38,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 104390656. Throughput: 0: 6059.0. Samples: 26092010. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 15:52:38,127][15372] Avg episode reward: [(0, '36.464')] [2024-08-05 15:52:40,755][15444] Updated weights for policy 0, policy_version 12751 (0.0014) [2024-08-05 15:52:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 104513536. Throughput: 0: 6063.8. Samples: 26128310. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 15:52:43,119][15372] Avg episode reward: [(0, '36.315')] [2024-08-05 15:52:44,169][15444] Updated weights for policy 0, policy_version 12761 (0.0019) [2024-08-05 15:52:47,418][15444] Updated weights for policy 0, policy_version 12771 (0.0026) [2024-08-05 15:52:48,118][15372] Fps is (10 sec: 24577.5, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 104636416. Throughput: 0: 6030.9. Samples: 26163940. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 15:52:48,126][15372] Avg episode reward: [(0, '35.083')] [2024-08-05 15:52:50,883][15444] Updated weights for policy 0, policy_version 12781 (0.0016) [2024-08-05 15:52:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 104751104. Throughput: 0: 6044.2. Samples: 26182600. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 15:52:53,126][15372] Avg episode reward: [(0, '35.620')] [2024-08-05 15:52:54,243][15444] Updated weights for policy 0, policy_version 12791 (0.0020) [2024-08-05 15:52:57,947][15444] Updated weights for policy 0, policy_version 12801 (0.0017) [2024-08-05 15:52:58,119][15372] Fps is (10 sec: 22936.9, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 104865792. Throughput: 0: 6023.3. Samples: 26218540. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 15:52:58,119][15372] Avg episode reward: [(0, '35.316')] [2024-08-05 15:53:00,877][15444] Updated weights for policy 0, policy_version 12811 (0.0016) [2024-08-05 15:53:03,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 104996864. Throughput: 0: 6020.7. Samples: 26254310. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 15:53:03,126][15372] Avg episode reward: [(0, '34.875')] [2024-08-05 15:53:04,586][15444] Updated weights for policy 0, policy_version 12821 (0.0021) [2024-08-05 15:53:08,034][15444] Updated weights for policy 0, policy_version 12831 (0.0025) [2024-08-05 15:53:08,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24031.7, 300 sec: 24159.5). Total num frames: 105111552. Throughput: 0: 6006.9. Samples: 26272170. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:53:08,119][15372] Avg episode reward: [(0, '34.816')] [2024-08-05 15:53:11,388][15444] Updated weights for policy 0, policy_version 12841 (0.0025) [2024-08-05 15:53:13,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.5, 300 sec: 24159.5). 
Total num frames: 105234432. Throughput: 0: 6013.2. Samples: 26307960. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:53:13,128][15372] Avg episode reward: [(0, '36.025')] [2024-08-05 15:53:14,780][15444] Updated weights for policy 0, policy_version 12851 (0.0013) [2024-08-05 15:53:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23893.4, 300 sec: 24159.5). Total num frames: 105349120. Throughput: 0: 6018.0. Samples: 26344960. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:53:18,126][15372] Avg episode reward: [(0, '35.932')] [2024-08-05 15:53:18,160][15444] Updated weights for policy 0, policy_version 12861 (0.0013) [2024-08-05 15:53:19,262][15417] Signal inference workers to stop experience collection... (4650 times) [2024-08-05 15:53:19,268][15417] Signal inference workers to resume experience collection... (4650 times) [2024-08-05 15:53:19,315][15444] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-08-05 15:53:19,316][15444] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-08-05 15:53:21,371][15444] Updated weights for policy 0, policy_version 12871 (0.0027) [2024-08-05 15:53:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 105472000. Throughput: 0: 6036.5. Samples: 26363650. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:53:23,119][15372] Avg episode reward: [(0, '34.635')] [2024-08-05 15:53:24,670][15444] Updated weights for policy 0, policy_version 12881 (0.0012) [2024-08-05 15:53:27,844][15444] Updated weights for policy 0, policy_version 12891 (0.0021) [2024-08-05 15:53:28,119][15372] Fps is (10 sec: 25394.2, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 105603072. Throughput: 0: 6064.8. Samples: 26401230. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:53:28,119][15372] Avg episode reward: [(0, '34.850')] [2024-08-05 15:53:31,525][15444] Updated weights for policy 0, policy_version 12901 (0.0015) [2024-08-05 15:53:33,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 105725952. Throughput: 0: 6063.3. Samples: 26436790. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 15:53:33,119][15372] Avg episode reward: [(0, '35.550')] [2024-08-05 15:53:34,718][15444] Updated weights for policy 0, policy_version 12911 (0.0011) [2024-08-05 15:53:38,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24166.6, 300 sec: 24159.7). Total num frames: 105840640. Throughput: 0: 6063.1. Samples: 26455440. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:53:38,126][15372] Avg episode reward: [(0, '34.265')] [2024-08-05 15:53:38,308][15444] Updated weights for policy 0, policy_version 12921 (0.0014) [2024-08-05 15:53:41,410][15444] Updated weights for policy 0, policy_version 12931 (0.0024) [2024-08-05 15:53:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 105963520. Throughput: 0: 6061.8. Samples: 26491320. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:53:43,126][15372] Avg episode reward: [(0, '34.502')] [2024-08-05 15:53:45,084][15444] Updated weights for policy 0, policy_version 12941 (0.0017) [2024-08-05 15:53:48,121][15372] Fps is (10 sec: 24570.7, 60 sec: 24165.5, 300 sec: 24159.3). Total num frames: 106086400. Throughput: 0: 6082.6. Samples: 26528040. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:53:48,129][15372] Avg episode reward: [(0, '35.069')] [2024-08-05 15:53:48,182][15444] Updated weights for policy 0, policy_version 12951 (0.0015) [2024-08-05 15:53:51,648][15444] Updated weights for policy 0, policy_version 12961 (0.0014) [2024-08-05 15:53:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24159.5). 
Total num frames: 106209280. Throughput: 0: 6095.3. Samples: 26546460. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 15:53:53,126][15372] Avg episode reward: [(0, '35.576')] [2024-08-05 15:53:55,043][15444] Updated weights for policy 0, policy_version 12971 (0.0014) [2024-08-05 15:53:58,118][15372] Fps is (10 sec: 24581.3, 60 sec: 24439.6, 300 sec: 24187.2). Total num frames: 106332160. Throughput: 0: 6118.0. Samples: 26583270. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 15:53:58,126][15372] Avg episode reward: [(0, '35.732')] [2024-08-05 15:53:58,279][15444] Updated weights for policy 0, policy_version 12981 (0.0016) [2024-08-05 15:54:01,703][15444] Updated weights for policy 0, policy_version 12991 (0.0020) [2024-08-05 15:54:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 106446848. Throughput: 0: 6090.7. Samples: 26619040. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 15:54:03,119][15372] Avg episode reward: [(0, '35.209')] [2024-08-05 15:54:05,180][15444] Updated weights for policy 0, policy_version 13001 (0.0012) [2024-08-05 15:54:07,818][15417] Signal inference workers to stop experience collection... (4700 times) [2024-08-05 15:54:07,828][15417] Signal inference workers to resume experience collection... (4700 times) [2024-08-05 15:54:07,875][15444] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-08-05 15:54:07,876][15444] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-08-05 15:54:08,119][15372] Fps is (10 sec: 23755.5, 60 sec: 24302.7, 300 sec: 24159.4). Total num frames: 106569728. Throughput: 0: 6091.0. Samples: 26637750. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:54:08,127][15372] Avg episode reward: [(0, '35.690')] [2024-08-05 15:54:08,531][15444] Updated weights for policy 0, policy_version 13011 (0.0011) [2024-08-05 15:54:12,137][15444] Updated weights for policy 0, policy_version 13021 (0.0022) [2024-08-05 15:54:13,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 106700800. Throughput: 0: 6053.6. Samples: 26673640. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:54:13,119][15372] Avg episode reward: [(0, '36.384')] [2024-08-05 15:54:15,226][15444] Updated weights for policy 0, policy_version 13031 (0.0015) [2024-08-05 15:54:18,118][15372] Fps is (10 sec: 23758.0, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 106807296. Throughput: 0: 6069.1. Samples: 26709900. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 15:54:18,126][15372] Avg episode reward: [(0, '36.463')] [2024-08-05 15:54:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000013038_106807296.pth... [2024-08-05 15:54:18,317][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000012331_101015552.pth [2024-08-05 15:54:18,911][15444] Updated weights for policy 0, policy_version 13041 (0.0028) [2024-08-05 15:54:22,322][15444] Updated weights for policy 0, policy_version 13051 (0.0030) [2024-08-05 15:54:23,119][15372] Fps is (10 sec: 22936.7, 60 sec: 24302.8, 300 sec: 24131.7). Total num frames: 106930176. Throughput: 0: 6062.2. Samples: 26728240. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 15:54:23,119][15372] Avg episode reward: [(0, '35.627')] [2024-08-05 15:54:25,477][15444] Updated weights for policy 0, policy_version 13061 (0.0054) [2024-08-05 15:54:28,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24303.1, 300 sec: 24187.2). Total num frames: 107061248. 
Throughput: 0: 6052.9. Samples: 26763700. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 15:54:28,127][15372] Avg episode reward: [(0, '34.904')] [2024-08-05 15:54:29,182][15444] Updated weights for policy 0, policy_version 13071 (0.0024) [2024-08-05 15:54:32,236][15444] Updated weights for policy 0, policy_version 13081 (0.0025) [2024-08-05 15:54:33,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 107167744. Throughput: 0: 6025.2. Samples: 26799160. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 15:54:33,126][15372] Avg episode reward: [(0, '36.138')] [2024-08-05 15:54:35,883][15444] Updated weights for policy 0, policy_version 13091 (0.0034) [2024-08-05 15:54:38,121][15372] Fps is (10 sec: 22931.6, 60 sec: 24165.3, 300 sec: 24103.7). Total num frames: 107290624. Throughput: 0: 6022.1. Samples: 26817470. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:54:38,121][15372] Avg episode reward: [(0, '36.741')] [2024-08-05 15:54:38,208][15417] Saving new best policy, reward=36.741! [2024-08-05 15:54:39,251][15444] Updated weights for policy 0, policy_version 13101 (0.0035) [2024-08-05 15:54:42,623][15444] Updated weights for policy 0, policy_version 13111 (0.0015) [2024-08-05 15:54:43,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 107413504. Throughput: 0: 6011.8. Samples: 26853800. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:54:43,119][15372] Avg episode reward: [(0, '35.290')] [2024-08-05 15:54:46,189][15444] Updated weights for policy 0, policy_version 13121 (0.0013) [2024-08-05 15:54:48,118][15372] Fps is (10 sec: 23763.2, 60 sec: 24030.7, 300 sec: 24131.8). Total num frames: 107528192. Throughput: 0: 6001.1. Samples: 26889090. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:54:48,119][15372] Avg episode reward: [(0, '35.927')] [2024-08-05 15:54:48,791][15417] Signal inference workers to stop experience collection... 
(4750 times) [2024-08-05 15:54:48,792][15417] Signal inference workers to resume experience collection... (4750 times) [2024-08-05 15:54:48,861][15444] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-08-05 15:54:48,861][15444] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-08-05 15:54:49,436][15444] Updated weights for policy 0, policy_version 13131 (0.0010) [2024-08-05 15:54:52,961][15444] Updated weights for policy 0, policy_version 13141 (0.0017) [2024-08-05 15:54:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 107651072. Throughput: 0: 6002.3. Samples: 26907850. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:54:53,119][15372] Avg episode reward: [(0, '36.563')] [2024-08-05 15:54:56,021][15444] Updated weights for policy 0, policy_version 13151 (0.0011) [2024-08-05 15:54:58,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 107782144. Throughput: 0: 6010.2. Samples: 26944100. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:54:58,119][15372] Avg episode reward: [(0, '36.331')] [2024-08-05 15:54:59,670][15444] Updated weights for policy 0, policy_version 13161 (0.0036) [2024-08-05 15:55:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24104.0). Total num frames: 107888640. Throughput: 0: 5994.9. Samples: 26979670. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 15:55:03,119][15372] Avg episode reward: [(0, '35.569')] [2024-08-05 15:55:03,269][15444] Updated weights for policy 0, policy_version 13171 (0.0030) [2024-08-05 15:55:06,284][15444] Updated weights for policy 0, policy_version 13181 (0.0017) [2024-08-05 15:55:08,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24030.1, 300 sec: 24132.0). Total num frames: 108011520. Throughput: 0: 6007.6. Samples: 26998580. 
Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 15:55:08,126][15372] Avg episode reward: [(0, '35.589')] [2024-08-05 15:55:09,963][15444] Updated weights for policy 0, policy_version 13191 (0.0014) [2024-08-05 15:55:13,051][15444] Updated weights for policy 0, policy_version 13201 (0.0019) [2024-08-05 15:55:13,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 108142592. Throughput: 0: 6036.0. Samples: 27035320. Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 15:55:13,119][15372] Avg episode reward: [(0, '36.293')] [2024-08-05 15:55:16,648][15444] Updated weights for policy 0, policy_version 13211 (0.0014) [2024-08-05 15:55:18,119][15372] Fps is (10 sec: 24574.7, 60 sec: 24166.2, 300 sec: 24131.6). Total num frames: 108257280. Throughput: 0: 6041.7. Samples: 27071040. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 15:55:18,127][15372] Avg episode reward: [(0, '35.126')] [2024-08-05 15:55:19,892][15444] Updated weights for policy 0, policy_version 13221 (0.0011) [2024-08-05 15:55:23,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 108380160. Throughput: 0: 6056.8. Samples: 27090010. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 15:55:23,127][15372] Avg episode reward: [(0, '34.949')] [2024-08-05 15:55:23,152][15444] Updated weights for policy 0, policy_version 13231 (0.0015) [2024-08-05 15:55:26,905][15444] Updated weights for policy 0, policy_version 13241 (0.0010) [2024-08-05 15:55:28,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 108503040. Throughput: 0: 6050.0. Samples: 27126050. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 15:55:28,119][15372] Avg episode reward: [(0, '35.351')] [2024-08-05 15:55:29,907][15444] Updated weights for policy 0, policy_version 13251 (0.0017) [2024-08-05 15:55:31,429][15417] Signal inference workers to stop experience collection... 
(4800 times) [2024-08-05 15:55:31,430][15417] Signal inference workers to resume experience collection... (4800 times) [2024-08-05 15:55:31,482][15444] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-08-05 15:55:31,482][15444] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-08-05 15:55:33,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24302.9, 300 sec: 24131.9). Total num frames: 108625920. Throughput: 0: 6087.8. Samples: 27163040. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 15:55:33,119][15372] Avg episode reward: [(0, '36.227')] [2024-08-05 15:55:33,308][15444] Updated weights for policy 0, policy_version 13261 (0.0021) [2024-08-05 15:55:36,944][15444] Updated weights for policy 0, policy_version 13271 (0.0011) [2024-08-05 15:55:38,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24304.0, 300 sec: 24159.5). Total num frames: 108748800. Throughput: 0: 6074.7. Samples: 27181210. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 15:55:38,119][15372] Avg episode reward: [(0, '35.331')] [2024-08-05 15:55:39,911][15444] Updated weights for policy 0, policy_version 13281 (0.0013) [2024-08-05 15:55:43,123][15372] Fps is (10 sec: 24566.7, 60 sec: 24301.4, 300 sec: 24159.2). Total num frames: 108871680. Throughput: 0: 6079.3. Samples: 27217690. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:55:43,131][15372] Avg episode reward: [(0, '35.152')] [2024-08-05 15:55:43,631][15444] Updated weights for policy 0, policy_version 13291 (0.0026) [2024-08-05 15:55:46,870][15444] Updated weights for policy 0, policy_version 13301 (0.0022) [2024-08-05 15:55:48,120][15372] Fps is (10 sec: 23754.0, 60 sec: 24302.5, 300 sec: 24131.6). Total num frames: 108986368. Throughput: 0: 6079.0. Samples: 27253230. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 15:55:48,120][15372] Avg episode reward: [(0, '35.456')] [2024-08-05 15:55:50,198][15444] Updated weights for policy 0, policy_version 13311 (0.0018) [2024-08-05 15:55:53,118][15372] Fps is (10 sec: 23765.7, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 109109248. Throughput: 0: 6070.0. Samples: 27271730. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 15:55:53,126][15372] Avg episode reward: [(0, '35.754')] [2024-08-05 15:55:53,664][15444] Updated weights for policy 0, policy_version 13321 (0.0012) [2024-08-05 15:55:57,019][15444] Updated weights for policy 0, policy_version 13331 (0.0017) [2024-08-05 15:55:58,118][15372] Fps is (10 sec: 24579.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 109232128. Throughput: 0: 6053.3. Samples: 27307720. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 15:55:58,119][15372] Avg episode reward: [(0, '36.419')] [2024-08-05 15:56:00,274][15444] Updated weights for policy 0, policy_version 13341 (0.0018) [2024-08-05 15:56:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.4, 300 sec: 24159.5). Total num frames: 109355008. Throughput: 0: 6081.8. Samples: 27344720. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:56:03,126][15372] Avg episode reward: [(0, '36.201')] [2024-08-05 15:56:03,687][15444] Updated weights for policy 0, policy_version 13351 (0.0012) [2024-08-05 15:56:07,201][15444] Updated weights for policy 0, policy_version 13361 (0.0011) [2024-08-05 15:56:08,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24439.5, 300 sec: 24187.3). Total num frames: 109477888. Throughput: 0: 6074.2. Samples: 27363350. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:56:08,119][15372] Avg episode reward: [(0, '35.832')] [2024-08-05 15:56:10,376][15444] Updated weights for policy 0, policy_version 13371 (0.0021) [2024-08-05 15:56:13,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24159.5). 
Total num frames: 109600768. Throughput: 0: 6075.5. Samples: 27399450. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:56:13,119][15372] Avg episode reward: [(0, '35.583')] [2024-08-05 15:56:14,043][15444] Updated weights for policy 0, policy_version 13381 (0.0039) [2024-08-05 15:56:17,166][15444] Updated weights for policy 0, policy_version 13391 (0.0019) [2024-08-05 15:56:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.1, 300 sec: 24159.5). Total num frames: 109715456. Throughput: 0: 6051.1. Samples: 27435340. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:56:18,119][15372] Avg episode reward: [(0, '35.320')] [2024-08-05 15:56:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000013393_109715456.pth... [2024-08-05 15:56:18,247][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000012685_103915520.pth [2024-08-05 15:56:20,790][15444] Updated weights for policy 0, policy_version 13401 (0.0019) [2024-08-05 15:56:23,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24303.0, 300 sec: 24187.9). Total num frames: 109838336. Throughput: 0: 6064.9. Samples: 27454130. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 15:56:23,119][15372] Avg episode reward: [(0, '35.446')] [2024-08-05 15:56:23,819][15444] Updated weights for policy 0, policy_version 13411 (0.0023) [2024-08-05 15:56:24,942][15417] Signal inference workers to stop experience collection... (4850 times) [2024-08-05 15:56:24,945][15417] Signal inference workers to resume experience collection... 
(4850 times) [2024-08-05 15:56:25,012][15444] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-08-05 15:56:25,018][15444] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-08-05 15:56:27,291][15444] Updated weights for policy 0, policy_version 13421 (0.0042) [2024-08-05 15:56:28,119][15372] Fps is (10 sec: 25394.9, 60 sec: 24439.4, 300 sec: 24215.0). Total num frames: 109969408. Throughput: 0: 6067.1. Samples: 27490690. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 15:56:28,119][15372] Avg episode reward: [(0, '35.710')] [2024-08-05 15:56:30,492][15444] Updated weights for policy 0, policy_version 13431 (0.0024) [2024-08-05 15:56:33,119][15372] Fps is (10 sec: 23755.4, 60 sec: 24166.1, 300 sec: 24159.4). Total num frames: 110075904. Throughput: 0: 6088.5. Samples: 27527210. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 15:56:33,119][15372] Avg episode reward: [(0, '36.249')] [2024-08-05 15:56:34,018][15444] Updated weights for policy 0, policy_version 13441 (0.0029) [2024-08-05 15:56:37,572][15444] Updated weights for policy 0, policy_version 13451 (0.0011) [2024-08-05 15:56:38,118][15372] Fps is (10 sec: 22938.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 110198784. Throughput: 0: 6076.7. Samples: 27545180. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:56:38,119][15372] Avg episode reward: [(0, '35.436')] [2024-08-05 15:56:40,690][15444] Updated weights for policy 0, policy_version 13461 (0.0015) [2024-08-05 15:56:43,118][15372] Fps is (10 sec: 25396.6, 60 sec: 24304.4, 300 sec: 24242.8). Total num frames: 110329856. Throughput: 0: 6070.9. Samples: 27580910. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:56:43,126][15372] Avg episode reward: [(0, '35.238')] [2024-08-05 15:56:44,303][15444] Updated weights for policy 0, policy_version 13471 (0.0018) [2024-08-05 15:56:47,707][15444] Updated weights for policy 0, policy_version 13481 (0.0012) [2024-08-05 15:56:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.9, 300 sec: 24187.2). Total num frames: 110436352. Throughput: 0: 6048.7. Samples: 27616910. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 15:56:48,119][15372] Avg episode reward: [(0, '35.060')] [2024-08-05 15:56:51,030][15444] Updated weights for policy 0, policy_version 13491 (0.0010) [2024-08-05 15:56:53,127][15372] Fps is (10 sec: 23736.4, 60 sec: 24299.4, 300 sec: 24242.1). Total num frames: 110567424. Throughput: 0: 6052.8. Samples: 27635780. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 15:56:53,127][15372] Avg episode reward: [(0, '35.779')] [2024-08-05 15:56:54,691][15444] Updated weights for policy 0, policy_version 13501 (0.0018) [2024-08-05 15:56:57,748][15444] Updated weights for policy 0, policy_version 13511 (0.0017) [2024-08-05 15:56:58,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 110690304. Throughput: 0: 6055.8. Samples: 27671960. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 15:56:58,119][15372] Avg episode reward: [(0, '35.855')] [2024-08-05 15:57:01,231][15444] Updated weights for policy 0, policy_version 13521 (0.0020) [2024-08-05 15:57:03,121][15372] Fps is (10 sec: 23772.0, 60 sec: 24165.5, 300 sec: 24187.4). Total num frames: 110804992. Throughput: 0: 6045.0. Samples: 27707380. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 15:57:03,128][15372] Avg episode reward: [(0, '36.495')] [2024-08-05 15:57:04,570][15444] Updated weights for policy 0, policy_version 13531 (0.0026) [2024-08-05 15:57:07,981][15444] Updated weights for policy 0, policy_version 13541 (0.0023) [2024-08-05 15:57:08,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 110927872. Throughput: 0: 6055.9. Samples: 27726650. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:57:08,119][15372] Avg episode reward: [(0, '36.015')] [2024-08-05 15:57:11,365][15444] Updated weights for policy 0, policy_version 13551 (0.0014) [2024-08-05 15:57:13,118][15372] Fps is (10 sec: 24581.3, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 111050752. Throughput: 0: 6044.9. Samples: 27762710. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 15:57:13,126][15372] Avg episode reward: [(0, '35.670')] [2024-08-05 15:57:14,220][15417] Signal inference workers to stop experience collection... (4900 times) [2024-08-05 15:57:14,220][15417] Signal inference workers to resume experience collection... (4900 times) [2024-08-05 15:57:14,293][15444] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-08-05 15:57:14,293][15444] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-08-05 15:57:14,784][15444] Updated weights for policy 0, policy_version 13561 (0.0021) [2024-08-05 15:57:18,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 111165440. Throughput: 0: 6061.9. Samples: 27799990. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:57:18,119][15372] Avg episode reward: [(0, '36.749')] [2024-08-05 15:57:18,153][15417] Saving new best policy, reward=36.749! 
[2024-08-05 15:57:18,171][15444] Updated weights for policy 0, policy_version 13571 (0.0021) [2024-08-05 15:57:21,220][15444] Updated weights for policy 0, policy_version 13581 (0.0038) [2024-08-05 15:57:23,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24166.2, 300 sec: 24215.0). Total num frames: 111288320. Throughput: 0: 6061.3. Samples: 27817940. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:57:23,127][15372] Avg episode reward: [(0, '36.571')] [2024-08-05 15:57:24,754][15444] Updated weights for policy 0, policy_version 13591 (0.0012) [2024-08-05 15:57:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 111411200. Throughput: 0: 6084.9. Samples: 27854730. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 15:57:28,126][15372] Avg episode reward: [(0, '36.183')] [2024-08-05 15:57:28,342][15444] Updated weights for policy 0, policy_version 13601 (0.0010) [2024-08-05 15:57:31,466][15444] Updated weights for policy 0, policy_version 13611 (0.0032) [2024-08-05 15:57:33,119][15372] Fps is (10 sec: 25395.8, 60 sec: 24439.6, 300 sec: 24242.8). Total num frames: 111542272. Throughput: 0: 6087.1. Samples: 27890830. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 15:57:33,127][15372] Avg episode reward: [(0, '36.015')] [2024-08-05 15:57:34,835][15444] Updated weights for policy 0, policy_version 13621 (0.0010) [2024-08-05 15:57:37,953][15444] Updated weights for policy 0, policy_version 13631 (0.0037) [2024-08-05 15:57:38,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 111665152. Throughput: 0: 6091.8. Samples: 27909860. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:57:38,119][15372] Avg episode reward: [(0, '36.400')] [2024-08-05 15:57:41,611][15444] Updated weights for policy 0, policy_version 13641 (0.0013) [2024-08-05 15:57:43,119][15372] Fps is (10 sec: 23757.3, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 111779840. 
Throughput: 0: 6083.5. Samples: 27945720. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 15:57:43,119][15372] Avg episode reward: [(0, '35.695')] [2024-08-05 15:57:45,188][15444] Updated weights for policy 0, policy_version 13651 (0.0011) [2024-08-05 15:57:48,126][15372] Fps is (10 sec: 23739.3, 60 sec: 24436.5, 300 sec: 24242.2). Total num frames: 111902720. Throughput: 0: 6091.1. Samples: 27981510. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 15:57:48,133][15372] Avg episode reward: [(0, '35.466')] [2024-08-05 15:57:48,342][15444] Updated weights for policy 0, policy_version 13661 (0.0022) [2024-08-05 15:57:51,796][15444] Updated weights for policy 0, policy_version 13671 (0.0022) [2024-08-05 15:57:53,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24306.4, 300 sec: 24270.6). Total num frames: 112025600. Throughput: 0: 6069.2. Samples: 27999760. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 15:57:53,119][15372] Avg episode reward: [(0, '36.047')] [2024-08-05 15:57:54,995][15444] Updated weights for policy 0, policy_version 13681 (0.0020) [2024-08-05 15:57:56,052][15417] Signal inference workers to stop experience collection... (4950 times) [2024-08-05 15:57:56,060][15417] Signal inference workers to resume experience collection... (4950 times) [2024-08-05 15:57:56,092][15444] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-08-05 15:57:56,097][15444] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-08-05 15:57:58,119][15372] Fps is (10 sec: 24592.7, 60 sec: 24302.7, 300 sec: 24242.7). Total num frames: 112148480. Throughput: 0: 6087.3. Samples: 28036640. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:57:58,119][15372] Avg episode reward: [(0, '35.801')] [2024-08-05 15:57:58,312][15444] Updated weights for policy 0, policy_version 13691 (0.0021) [2024-08-05 15:58:02,065][15444] Updated weights for policy 0, policy_version 13701 (0.0019) [2024-08-05 15:58:03,123][15372] Fps is (10 sec: 24568.1, 60 sec: 24439.0, 300 sec: 24270.3). Total num frames: 112271360. Throughput: 0: 6055.6. Samples: 28072510. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 15:58:03,124][15372] Avg episode reward: [(0, '36.020')] [2024-08-05 15:58:05,096][15444] Updated weights for policy 0, policy_version 13711 (0.0011) [2024-08-05 15:58:08,118][15372] Fps is (10 sec: 23758.2, 60 sec: 24303.1, 300 sec: 24242.8). Total num frames: 112386048. Throughput: 0: 6075.6. Samples: 28091340. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 15:58:08,126][15372] Avg episode reward: [(0, '36.372')] [2024-08-05 15:58:08,655][15444] Updated weights for policy 0, policy_version 13721 (0.0021) [2024-08-05 15:58:11,927][15444] Updated weights for policy 0, policy_version 13731 (0.0024) [2024-08-05 15:58:13,118][15372] Fps is (10 sec: 23764.6, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 112508928. Throughput: 0: 6054.9. Samples: 28127200. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 15:58:13,119][15372] Avg episode reward: [(0, '34.825')] [2024-08-05 15:58:15,279][15444] Updated weights for policy 0, policy_version 13741 (0.0013) [2024-08-05 15:58:18,135][15372] Fps is (10 sec: 24536.4, 60 sec: 24432.9, 300 sec: 24269.2). Total num frames: 112631808. Throughput: 0: 6074.3. Samples: 28164270. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 15:58:18,143][15372] Avg episode reward: [(0, '34.557')] [2024-08-05 15:58:18,189][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000013750_112640000.pth... 
[2024-08-05 15:58:18,303][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000013038_106807296.pth [2024-08-05 15:58:18,834][15444] Updated weights for policy 0, policy_version 13751 (0.0010) [2024-08-05 15:58:21,975][15444] Updated weights for policy 0, policy_version 13761 (0.0020) [2024-08-05 15:58:23,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24303.1, 300 sec: 24215.0). Total num frames: 112746496. Throughput: 0: 6059.5. Samples: 28182540. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 15:58:23,119][15372] Avg episode reward: [(0, '35.814')] [2024-08-05 15:58:25,450][15444] Updated weights for policy 0, policy_version 13771 (0.0019) [2024-08-05 15:58:28,119][15372] Fps is (10 sec: 24614.5, 60 sec: 24439.3, 300 sec: 24242.7). Total num frames: 112877568. Throughput: 0: 6073.5. Samples: 28219030. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 15:58:28,119][15372] Avg episode reward: [(0, '34.587')] [2024-08-05 15:58:29,134][15444] Updated weights for policy 0, policy_version 13781 (0.0020) [2024-08-05 15:58:32,252][15444] Updated weights for policy 0, policy_version 13791 (0.0021) [2024-08-05 15:58:33,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.5, 300 sec: 24242.8). Total num frames: 112992256. Throughput: 0: 6066.1. Samples: 28254440. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 15:58:33,127][15372] Avg episode reward: [(0, '35.178')] [2024-08-05 15:58:35,550][15444] Updated weights for policy 0, policy_version 13801 (0.0015) [2024-08-05 15:58:38,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.1, 300 sec: 24242.7). Total num frames: 113115136. Throughput: 0: 6072.6. Samples: 28273030. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:58:38,119][15372] Avg episode reward: [(0, '35.987')] [2024-08-05 15:58:38,864][15444] Updated weights for policy 0, policy_version 13811 (0.0013) [2024-08-05 15:58:41,040][15417] Signal inference workers to stop experience collection... (5000 times) [2024-08-05 15:58:41,048][15417] Signal inference workers to resume experience collection... (5000 times) [2024-08-05 15:58:41,101][15444] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-08-05 15:58:41,102][15444] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-08-05 15:58:42,470][15444] Updated weights for policy 0, policy_version 13821 (0.0027) [2024-08-05 15:58:43,120][15372] Fps is (10 sec: 24572.5, 60 sec: 24302.4, 300 sec: 24242.8). Total num frames: 113238016. Throughput: 0: 6055.4. Samples: 28309140. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 15:58:43,120][15372] Avg episode reward: [(0, '35.544')] [2024-08-05 15:58:45,563][15444] Updated weights for policy 0, policy_version 13831 (0.0010) [2024-08-05 15:58:48,118][15372] Fps is (10 sec: 24577.7, 60 sec: 24305.9, 300 sec: 24242.8). Total num frames: 113360896. Throughput: 0: 6061.3. Samples: 28345250. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:58:48,126][15372] Avg episode reward: [(0, '35.743')] [2024-08-05 15:58:49,377][15444] Updated weights for policy 0, policy_version 13841 (0.0033) [2024-08-05 15:58:52,411][15444] Updated weights for policy 0, policy_version 13851 (0.0010) [2024-08-05 15:58:53,118][15372] Fps is (10 sec: 23760.3, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 113475584. Throughput: 0: 6052.0. Samples: 28363680. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:58:53,119][15372] Avg episode reward: [(0, '35.839')] [2024-08-05 15:58:55,794][15444] Updated weights for policy 0, policy_version 13861 (0.0028) [2024-08-05 15:58:58,121][15372] Fps is (10 sec: 23750.1, 60 sec: 24165.5, 300 sec: 24242.5). Total num frames: 113598464. Throughput: 0: 6053.0. Samples: 28399600. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:58:58,129][15372] Avg episode reward: [(0, '36.110')] [2024-08-05 15:58:59,496][15444] Updated weights for policy 0, policy_version 13871 (0.0011) [2024-08-05 15:59:02,711][15444] Updated weights for policy 0, policy_version 13881 (0.0019) [2024-08-05 15:59:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24167.7, 300 sec: 24242.8). Total num frames: 113721344. Throughput: 0: 6042.2. Samples: 28436070. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 15:59:03,119][15372] Avg episode reward: [(0, '36.800')] [2024-08-05 15:59:03,119][15417] Saving new best policy, reward=36.800! [2024-08-05 15:59:06,094][15444] Updated weights for policy 0, policy_version 13891 (0.0010) [2024-08-05 15:59:08,119][15372] Fps is (10 sec: 23762.0, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 113836032. Throughput: 0: 6037.9. Samples: 28454250. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:59:08,126][15372] Avg episode reward: [(0, '36.935')] [2024-08-05 15:59:08,173][15417] Saving new best policy, reward=36.935! [2024-08-05 15:59:09,598][15444] Updated weights for policy 0, policy_version 13901 (0.0011) [2024-08-05 15:59:13,084][15444] Updated weights for policy 0, policy_version 13911 (0.0027) [2024-08-05 15:59:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 113958912. Throughput: 0: 6018.9. Samples: 28489880. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 15:59:13,119][15372] Avg episode reward: [(0, '36.978')] [2024-08-05 15:59:13,120][15417] Saving new best policy, reward=36.978! [2024-08-05 15:59:16,636][15444] Updated weights for policy 0, policy_version 13921 (0.0021) [2024-08-05 15:59:18,118][15372] Fps is (10 sec: 24577.5, 60 sec: 24172.9, 300 sec: 24242.8). Total num frames: 114081792. Throughput: 0: 6022.2. Samples: 28525440. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:59:18,126][15372] Avg episode reward: [(0, '36.558')] [2024-08-05 15:59:19,643][15444] Updated weights for policy 0, policy_version 13931 (0.0010) [2024-08-05 15:59:23,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 114196480. Throughput: 0: 6037.9. Samples: 28544730. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 15:59:23,127][15372] Avg episode reward: [(0, '35.905')] [2024-08-05 15:59:23,148][15444] Updated weights for policy 0, policy_version 13941 (0.0032) [2024-08-05 15:59:26,508][15444] Updated weights for policy 0, policy_version 13951 (0.0020) [2024-08-05 15:59:28,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24029.9, 300 sec: 24242.7). Total num frames: 114319360. Throughput: 0: 6029.9. Samples: 28580480. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:59:28,127][15372] Avg episode reward: [(0, '35.612')] [2024-08-05 15:59:29,953][15444] Updated weights for policy 0, policy_version 13961 (0.0012) [2024-08-05 15:59:31,132][15417] Signal inference workers to stop experience collection... (5050 times) [2024-08-05 15:59:31,133][15417] Signal inference workers to resume experience collection... (5050 times) [2024-08-05 15:59:31,171][15444] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-08-05 15:59:31,171][15444] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-08-05 15:59:33,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.3, 300 sec: 24243.0). 
Total num frames: 114442240. Throughput: 0: 6047.5. Samples: 28617390. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 15:59:33,119][15372] Avg episode reward: [(0, '35.742')] [2024-08-05 15:59:33,304][15444] Updated weights for policy 0, policy_version 13971 (0.0013) [2024-08-05 15:59:36,437][15444] Updated weights for policy 0, policy_version 13981 (0.0017) [2024-08-05 15:59:38,118][15372] Fps is (10 sec: 24577.3, 60 sec: 24166.7, 300 sec: 24242.8). Total num frames: 114565120. Throughput: 0: 6050.9. Samples: 28635970. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:59:38,119][15372] Avg episode reward: [(0, '36.372')] [2024-08-05 15:59:39,737][15444] Updated weights for policy 0, policy_version 13991 (0.0023) [2024-08-05 15:59:43,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24167.0, 300 sec: 24270.5). Total num frames: 114688000. Throughput: 0: 6075.7. Samples: 28672990. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 15:59:43,126][15372] Avg episode reward: [(0, '37.055')] [2024-08-05 15:59:43,192][15417] Saving new best policy, reward=37.055! [2024-08-05 15:59:43,215][15444] Updated weights for policy 0, policy_version 14001 (0.0011) [2024-08-05 15:59:46,691][15444] Updated weights for policy 0, policy_version 14011 (0.0037) [2024-08-05 15:59:48,119][15372] Fps is (10 sec: 24574.3, 60 sec: 24166.1, 300 sec: 24270.5). Total num frames: 114810880. Throughput: 0: 6055.5. Samples: 28708570. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:59:48,119][15372] Avg episode reward: [(0, '37.030')] [2024-08-05 15:59:50,032][15444] Updated weights for policy 0, policy_version 14021 (0.0013) [2024-08-05 15:59:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 114933760. Throughput: 0: 6065.2. Samples: 28727180. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 15:59:53,126][15372] Avg episode reward: [(0, '36.075')] [2024-08-05 15:59:53,419][15444] Updated weights for policy 0, policy_version 14031 (0.0021) [2024-08-05 15:59:56,666][15444] Updated weights for policy 0, policy_version 14041 (0.0020) [2024-08-05 15:59:58,118][15372] Fps is (10 sec: 24577.7, 60 sec: 24304.1, 300 sec: 24298.3). Total num frames: 115056640. Throughput: 0: 6083.1. Samples: 28763620. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 15:59:58,119][15372] Avg episode reward: [(0, '35.240')] [2024-08-05 16:00:00,134][15444] Updated weights for policy 0, policy_version 14051 (0.0013) [2024-08-05 16:00:03,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 115179520. Throughput: 0: 6116.2. Samples: 28800670. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:00:03,126][15372] Avg episode reward: [(0, '34.264')] [2024-08-05 16:00:03,369][15444] Updated weights for policy 0, policy_version 14061 (0.0010) [2024-08-05 16:00:06,643][15444] Updated weights for policy 0, policy_version 14071 (0.0010) [2024-08-05 16:00:08,119][15372] Fps is (10 sec: 23754.7, 60 sec: 24302.8, 300 sec: 24242.7). Total num frames: 115294208. Throughput: 0: 6093.0. Samples: 28818920. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:00:08,127][15372] Avg episode reward: [(0, '36.639')] [2024-08-05 16:00:10,143][15444] Updated weights for policy 0, policy_version 14081 (0.0024) [2024-08-05 16:00:13,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 115425280. Throughput: 0: 6131.6. Samples: 28856400. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 16:00:13,126][15372] Avg episode reward: [(0, '36.563')] [2024-08-05 16:00:13,437][15444] Updated weights for policy 0, policy_version 14091 (0.0018) [2024-08-05 16:00:16,891][15444] Updated weights for policy 0, policy_version 14101 (0.0015) [2024-08-05 16:00:18,118][15372] Fps is (10 sec: 25397.5, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 115548160. Throughput: 0: 6100.9. Samples: 28891930. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 16:00:18,119][15372] Avg episode reward: [(0, '35.165')] [2024-08-05 16:00:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000014105_115548160.pth... [2024-08-05 16:00:18,238][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000013393_109715456.pth [2024-08-05 16:00:20,059][15444] Updated weights for policy 0, policy_version 14111 (0.0011) [2024-08-05 16:00:23,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24439.4, 300 sec: 24270.5). Total num frames: 115662848. Throughput: 0: 6104.4. Samples: 28910670. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:00:23,126][15372] Avg episode reward: [(0, '35.802')] [2024-08-05 16:00:23,501][15444] Updated weights for policy 0, policy_version 14121 (0.0018) [2024-08-05 16:00:26,994][15444] Updated weights for policy 0, policy_version 14131 (0.0012) [2024-08-05 16:00:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24439.7, 300 sec: 24270.5). Total num frames: 115785728. Throughput: 0: 6079.3. Samples: 28946560. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:00:28,119][15372] Avg episode reward: [(0, '35.984')] [2024-08-05 16:00:30,215][15444] Updated weights for policy 0, policy_version 14141 (0.0018) [2024-08-05 16:00:33,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24439.4, 300 sec: 24270.5). Total num frames: 115908608. 
Throughput: 0: 6093.6. Samples: 28982780. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:00:33,127][15372] Avg episode reward: [(0, '35.744')] [2024-08-05 16:00:33,713][15444] Updated weights for policy 0, policy_version 14151 (0.0025) [2024-08-05 16:00:37,127][15444] Updated weights for policy 0, policy_version 14161 (0.0042) [2024-08-05 16:00:38,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24302.8, 300 sec: 24243.0). Total num frames: 116023296. Throughput: 0: 6083.7. Samples: 29000950. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:00:38,119][15372] Avg episode reward: [(0, '35.800')] [2024-08-05 16:00:40,644][15444] Updated weights for policy 0, policy_version 14171 (0.0025) [2024-08-05 16:00:40,983][15417] Signal inference workers to stop experience collection... (5100 times) [2024-08-05 16:00:40,983][15417] Signal inference workers to resume experience collection... (5100 times) [2024-08-05 16:00:41,051][15444] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-08-05 16:00:41,051][15444] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-08-05 16:00:43,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24439.4, 300 sec: 24298.4). Total num frames: 116154368. Throughput: 0: 6088.0. Samples: 29037580. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:00:43,119][15372] Avg episode reward: [(0, '36.149')] [2024-08-05 16:00:43,738][15444] Updated weights for policy 0, policy_version 14181 (0.0014) [2024-08-05 16:00:47,082][15444] Updated weights for policy 0, policy_version 14191 (0.0013) [2024-08-05 16:00:48,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.1, 300 sec: 24270.5). Total num frames: 116269056. Throughput: 0: 6086.2. Samples: 29074550. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:00:48,119][15372] Avg episode reward: [(0, '36.282')] [2024-08-05 16:00:50,513][15444] Updated weights for policy 0, policy_version 14201 (0.0014) [2024-08-05 16:00:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 116400128. Throughput: 0: 6093.5. Samples: 29093120. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:00:53,119][15372] Avg episode reward: [(0, '35.739')] [2024-08-05 16:00:53,937][15444] Updated weights for policy 0, policy_version 14211 (0.0018) [2024-08-05 16:00:57,213][15444] Updated weights for policy 0, policy_version 14221 (0.0012) [2024-08-05 16:00:58,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 116514816. Throughput: 0: 6067.3. Samples: 29129430. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:00:58,126][15372] Avg episode reward: [(0, '36.830')] [2024-08-05 16:01:00,599][15444] Updated weights for policy 0, policy_version 14231 (0.0012) [2024-08-05 16:01:03,123][15372] Fps is (10 sec: 23745.9, 60 sec: 24301.2, 300 sec: 24270.2). Total num frames: 116637696. Throughput: 0: 6090.9. Samples: 29166050. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 16:01:03,123][15372] Avg episode reward: [(0, '36.901')] [2024-08-05 16:01:03,979][15444] Updated weights for policy 0, policy_version 14241 (0.0010) [2024-08-05 16:01:07,377][15444] Updated weights for policy 0, policy_version 14251 (0.0016) [2024-08-05 16:01:08,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24439.8, 300 sec: 24270.5). Total num frames: 116760576. Throughput: 0: 6078.0. Samples: 29184180. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 16:01:08,120][15372] Avg episode reward: [(0, '35.514')] [2024-08-05 16:01:10,638][15444] Updated weights for policy 0, policy_version 14261 (0.0012) [2024-08-05 16:01:13,118][15372] Fps is (10 sec: 24587.4, 60 sec: 24302.9, 300 sec: 24298.3). 
Total num frames: 116883456. Throughput: 0: 6075.8. Samples: 29219970. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:01:13,126][15372] Avg episode reward: [(0, '35.412')] [2024-08-05 16:01:14,210][15444] Updated weights for policy 0, policy_version 14271 (0.0032) [2024-08-05 16:01:17,693][15444] Updated weights for policy 0, policy_version 14281 (0.0012) [2024-08-05 16:01:18,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 116998144. Throughput: 0: 6075.8. Samples: 29256190. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:01:18,119][15372] Avg episode reward: [(0, '35.959')] [2024-08-05 16:01:21,146][15444] Updated weights for policy 0, policy_version 14291 (0.0012) [2024-08-05 16:01:23,119][15372] Fps is (10 sec: 22937.4, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 117112832. Throughput: 0: 6078.0. Samples: 29274460. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:01:23,127][15372] Avg episode reward: [(0, '36.372')] [2024-08-05 16:01:24,341][15444] Updated weights for policy 0, policy_version 14301 (0.0021) [2024-08-05 16:01:27,929][15444] Updated weights for policy 0, policy_version 14311 (0.0030) [2024-08-05 16:01:28,119][15372] Fps is (10 sec: 23754.3, 60 sec: 24166.0, 300 sec: 24270.5). Total num frames: 117235712. Throughput: 0: 6059.4. Samples: 29310260. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:01:28,120][15372] Avg episode reward: [(0, '35.995')] [2024-08-05 16:01:30,902][15444] Updated weights for policy 0, policy_version 14321 (0.0013) [2024-08-05 16:01:32,384][15417] Signal inference workers to stop experience collection... (5150 times) [2024-08-05 16:01:32,392][15417] Signal inference workers to resume experience collection... 
(5150 times) [2024-08-05 16:01:32,428][15444] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-08-05 16:01:32,428][15444] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-08-05 16:01:33,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.6, 300 sec: 24270.5). Total num frames: 117358592. Throughput: 0: 6046.3. Samples: 29346630. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 16:01:33,119][15372] Avg episode reward: [(0, '36.235')] [2024-08-05 16:01:34,533][15444] Updated weights for policy 0, policy_version 14331 (0.0011) [2024-08-05 16:01:37,942][15444] Updated weights for policy 0, policy_version 14341 (0.0017) [2024-08-05 16:01:38,120][15372] Fps is (10 sec: 24574.6, 60 sec: 24302.4, 300 sec: 24242.6). Total num frames: 117481472. Throughput: 0: 6046.2. Samples: 29365210. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 16:01:38,120][15372] Avg episode reward: [(0, '35.620')] [2024-08-05 16:01:41,079][15444] Updated weights for policy 0, policy_version 14351 (0.0010) [2024-08-05 16:01:43,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.3, 300 sec: 24298.3). Total num frames: 117604352. Throughput: 0: 6038.0. Samples: 29401140. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:01:43,119][15372] Avg episode reward: [(0, '35.904')] [2024-08-05 16:01:44,799][15444] Updated weights for policy 0, policy_version 14361 (0.0020) [2024-08-05 16:01:48,119][15372] Fps is (10 sec: 23759.9, 60 sec: 24166.4, 300 sec: 24243.5). Total num frames: 117719040. Throughput: 0: 6011.5. Samples: 29436540. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:01:48,119][15372] Avg episode reward: [(0, '36.702')] [2024-08-05 16:01:48,217][15444] Updated weights for policy 0, policy_version 14371 (0.0018) [2024-08-05 16:01:51,660][15444] Updated weights for policy 0, policy_version 14381 (0.0039) [2024-08-05 16:01:53,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.9, 300 sec: 24242.8). 
Total num frames: 117841920. Throughput: 0: 6021.1. Samples: 29455130. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 16:01:53,119][15372] Avg episode reward: [(0, '37.428')] [2024-08-05 16:01:53,119][15417] Saving new best policy, reward=37.428! [2024-08-05 16:01:54,840][15444] Updated weights for policy 0, policy_version 14391 (0.0017) [2024-08-05 16:01:58,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.7, 300 sec: 24242.9). Total num frames: 117956608. Throughput: 0: 6029.5. Samples: 29491300. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 16:01:58,127][15372] Avg episode reward: [(0, '36.882')] [2024-08-05 16:01:58,461][15444] Updated weights for policy 0, policy_version 14401 (0.0025) [2024-08-05 16:02:01,957][15444] Updated weights for policy 0, policy_version 14411 (0.0036) [2024-08-05 16:02:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24168.3, 300 sec: 24270.6). Total num frames: 118087680. Throughput: 0: 6013.6. Samples: 29526800. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 16:02:03,119][15372] Avg episode reward: [(0, '36.307')] [2024-08-05 16:02:04,963][15444] Updated weights for policy 0, policy_version 14421 (0.0030) [2024-08-05 16:02:08,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24030.0, 300 sec: 24242.8). Total num frames: 118202368. Throughput: 0: 6026.2. Samples: 29545640. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:02:08,126][15372] Avg episode reward: [(0, '35.751')] [2024-08-05 16:02:08,607][15444] Updated weights for policy 0, policy_version 14431 (0.0013) [2024-08-05 16:02:10,947][15417] Signal inference workers to stop experience collection... (5200 times) [2024-08-05 16:02:10,948][15417] Signal inference workers to resume experience collection... 
(5200 times) [2024-08-05 16:02:11,016][15444] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-08-05 16:02:11,016][15444] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-08-05 16:02:11,895][15444] Updated weights for policy 0, policy_version 14441 (0.0014) [2024-08-05 16:02:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24270.5). Total num frames: 118325248. Throughput: 0: 6030.1. Samples: 29581610. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:02:13,119][15372] Avg episode reward: [(0, '35.400')] [2024-08-05 16:02:15,133][15444] Updated weights for policy 0, policy_version 14451 (0.0023) [2024-08-05 16:02:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24270.6). Total num frames: 118448128. Throughput: 0: 6042.4. Samples: 29618540. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:02:18,126][15372] Avg episode reward: [(0, '36.052')] [2024-08-05 16:02:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000014459_118448128.pth... [2024-08-05 16:02:18,276][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000013750_112640000.pth [2024-08-05 16:02:18,726][15444] Updated weights for policy 0, policy_version 14461 (0.0012) [2024-08-05 16:02:22,098][15444] Updated weights for policy 0, policy_version 14471 (0.0024) [2024-08-05 16:02:23,119][15372] Fps is (10 sec: 24574.6, 60 sec: 24302.7, 300 sec: 24270.5). Total num frames: 118571008. Throughput: 0: 6036.6. Samples: 29636850. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:02:23,119][15372] Avg episode reward: [(0, '35.880')] [2024-08-05 16:02:25,374][15444] Updated weights for policy 0, policy_version 14481 (0.0022) [2024-08-05 16:02:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.3, 300 sec: 24242.8). Total num frames: 118693888. 
Throughput: 0: 6042.2. Samples: 29673040. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:02:28,119][15372] Avg episode reward: [(0, '35.343')] [2024-08-05 16:02:28,976][15444] Updated weights for policy 0, policy_version 14491 (0.0014) [2024-08-05 16:02:32,105][15444] Updated weights for policy 0, policy_version 14501 (0.0019) [2024-08-05 16:02:33,118][15372] Fps is (10 sec: 23758.2, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 118808576. Throughput: 0: 6046.7. Samples: 29708640. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:02:33,126][15372] Avg episode reward: [(0, '35.819')] [2024-08-05 16:02:35,481][15444] Updated weights for policy 0, policy_version 14511 (0.0014) [2024-08-05 16:02:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24167.0, 300 sec: 24242.8). Total num frames: 118931456. Throughput: 0: 6038.0. Samples: 29726840. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:02:38,126][15372] Avg episode reward: [(0, '36.684')] [2024-08-05 16:02:39,111][15444] Updated weights for policy 0, policy_version 14521 (0.0026) [2024-08-05 16:02:42,384][15444] Updated weights for policy 0, policy_version 14531 (0.0040) [2024-08-05 16:02:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24243.4). Total num frames: 119054336. Throughput: 0: 6039.8. Samples: 29763090. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:02:43,126][15372] Avg episode reward: [(0, '36.547')] [2024-08-05 16:02:45,992][15444] Updated weights for policy 0, policy_version 14541 (0.0011) [2024-08-05 16:02:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 119169024. Throughput: 0: 6044.9. Samples: 29798820. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:02:48,119][15372] Avg episode reward: [(0, '35.890')] [2024-08-05 16:02:49,300][15444] Updated weights for policy 0, policy_version 14551 (0.0039) [2024-08-05 16:02:52,823][15444] Updated weights for policy 0, policy_version 14561 (0.0010) [2024-08-05 16:02:53,118][15372] Fps is (10 sec: 22937.7, 60 sec: 24029.9, 300 sec: 24187.3). Total num frames: 119283712. Throughput: 0: 6035.1. Samples: 29817220. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:02:53,119][15372] Avg episode reward: [(0, '36.213')] [2024-08-05 16:02:55,976][15444] Updated weights for policy 0, policy_version 14571 (0.0019) [2024-08-05 16:02:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24303.0, 300 sec: 24215.3). Total num frames: 119414784. Throughput: 0: 6023.8. Samples: 29852680. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:02:58,119][15372] Avg episode reward: [(0, '35.654')] [2024-08-05 16:02:59,713][15444] Updated weights for policy 0, policy_version 14581 (0.0023) [2024-08-05 16:03:00,290][15417] Signal inference workers to stop experience collection... (5250 times) [2024-08-05 16:03:00,290][15417] Signal inference workers to resume experience collection... (5250 times) [2024-08-05 16:03:00,332][15444] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-08-05 16:03:00,332][15444] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-08-05 16:03:03,119][15372] Fps is (10 sec: 23756.4, 60 sec: 23893.3, 300 sec: 24187.2). Total num frames: 119521280. Throughput: 0: 6004.7. Samples: 29888750. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:03:03,119][15372] Avg episode reward: [(0, '35.763')] [2024-08-05 16:03:03,165][15444] Updated weights for policy 0, policy_version 14591 (0.0012) [2024-08-05 16:03:06,284][15444] Updated weights for policy 0, policy_version 14601 (0.0018) [2024-08-05 16:03:08,118][15372] Fps is (10 sec: 22937.9, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 119644160. Throughput: 0: 6006.7. Samples: 29907150. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 16:03:08,126][15372] Avg episode reward: [(0, '35.653')] [2024-08-05 16:03:09,941][15444] Updated weights for policy 0, policy_version 14611 (0.0013) [2024-08-05 16:03:12,953][15444] Updated weights for policy 0, policy_version 14621 (0.0025) [2024-08-05 16:03:13,119][15372] Fps is (10 sec: 25395.1, 60 sec: 24166.3, 300 sec: 24216.3). Total num frames: 119775232. Throughput: 0: 6015.5. Samples: 29943740. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 16:03:13,119][15372] Avg episode reward: [(0, '36.341')] [2024-08-05 16:03:16,534][15444] Updated weights for policy 0, policy_version 14631 (0.0018) [2024-08-05 16:03:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 119889920. Throughput: 0: 6018.2. Samples: 29979460. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 16:03:18,119][15372] Avg episode reward: [(0, '35.733')] [2024-08-05 16:03:19,883][15444] Updated weights for policy 0, policy_version 14641 (0.0012) [2024-08-05 16:03:23,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24030.1, 300 sec: 24187.3). Total num frames: 120012800. Throughput: 0: 6024.7. Samples: 29997950. 
Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 16:03:23,126][15372] Avg episode reward: [(0, '36.102')] [2024-08-05 16:03:23,274][15444] Updated weights for policy 0, policy_version 14651 (0.0011) [2024-08-05 16:03:26,509][15444] Updated weights for policy 0, policy_version 14661 (0.0019) [2024-08-05 16:03:28,121][15372] Fps is (10 sec: 24570.1, 60 sec: 24028.9, 300 sec: 24214.8). Total num frames: 120135680. Throughput: 0: 6019.0. Samples: 30033960. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 16:03:28,129][15372] Avg episode reward: [(0, '36.062')] [2024-08-05 16:03:29,841][15444] Updated weights for policy 0, policy_version 14671 (0.0011) [2024-08-05 16:03:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 120258560. Throughput: 0: 6040.4. Samples: 30070640. Policy #0 lag: (min: 0.0, avg: 3.6, max: 9.0) [2024-08-05 16:03:33,126][15372] Avg episode reward: [(0, '36.739')] [2024-08-05 16:03:33,505][15444] Updated weights for policy 0, policy_version 14681 (0.0012) [2024-08-05 16:03:36,725][15444] Updated weights for policy 0, policy_version 14691 (0.0012) [2024-08-05 16:03:38,118][15372] Fps is (10 sec: 24581.9, 60 sec: 24166.4, 300 sec: 24215.1). Total num frames: 120381440. Throughput: 0: 6044.9. Samples: 30089240. Policy #0 lag: (min: 0.0, avg: 3.6, max: 9.0) [2024-08-05 16:03:38,119][15372] Avg episode reward: [(0, '36.134')] [2024-08-05 16:03:38,908][15417] Signal inference workers to stop experience collection... (5300 times) [2024-08-05 16:03:38,908][15417] Signal inference workers to resume experience collection... 
(5300 times) [2024-08-05 16:03:38,948][15444] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-08-05 16:03:38,948][15444] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-08-05 16:03:40,058][15444] Updated weights for policy 0, policy_version 14701 (0.0028) [2024-08-05 16:03:43,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 120504320. Throughput: 0: 6072.9. Samples: 30125960. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 16:03:43,119][15372] Avg episode reward: [(0, '35.907')] [2024-08-05 16:03:43,344][15444] Updated weights for policy 0, policy_version 14711 (0.0023) [2024-08-05 16:03:46,934][15444] Updated weights for policy 0, policy_version 14721 (0.0011) [2024-08-05 16:03:48,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 120627200. Throughput: 0: 6071.6. Samples: 30161970. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 16:03:48,119][15372] Avg episode reward: [(0, '36.115')] [2024-08-05 16:03:50,012][15444] Updated weights for policy 0, policy_version 14731 (0.0024) [2024-08-05 16:03:53,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24303.0, 300 sec: 24215.2). Total num frames: 120741888. Throughput: 0: 6076.7. Samples: 30180600. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:03:53,126][15372] Avg episode reward: [(0, '36.946')] [2024-08-05 16:03:53,720][15444] Updated weights for policy 0, policy_version 14741 (0.0015) [2024-08-05 16:03:56,996][15444] Updated weights for policy 0, policy_version 14751 (0.0011) [2024-08-05 16:03:58,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 120864768. Throughput: 0: 6069.6. Samples: 30216870. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:03:58,119][15372] Avg episode reward: [(0, '37.090')] [2024-08-05 16:04:00,209][15444] Updated weights for policy 0, policy_version 14761 (0.0025) [2024-08-05 16:04:03,118][15372] Fps is (10 sec: 25394.9, 60 sec: 24576.0, 300 sec: 24270.6). Total num frames: 120995840. Throughput: 0: 6097.1. Samples: 30253830. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:04:03,126][15372] Avg episode reward: [(0, '36.146')] [2024-08-05 16:04:03,426][15444] Updated weights for policy 0, policy_version 14771 (0.0023) [2024-08-05 16:04:06,873][15444] Updated weights for policy 0, policy_version 14781 (0.0022) [2024-08-05 16:04:08,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.4, 300 sec: 24242.8). Total num frames: 121110528. Throughput: 0: 6096.2. Samples: 30272280. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:04:08,119][15372] Avg episode reward: [(0, '35.094')] [2024-08-05 16:04:10,411][15444] Updated weights for policy 0, policy_version 14791 (0.0016) [2024-08-05 16:04:13,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 121233408. Throughput: 0: 6103.9. Samples: 30308620. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:04:13,119][15372] Avg episode reward: [(0, '35.051')] [2024-08-05 16:04:13,683][15444] Updated weights for policy 0, policy_version 14801 (0.0013) [2024-08-05 16:04:17,057][15444] Updated weights for policy 0, policy_version 14811 (0.0012) [2024-08-05 16:04:18,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.4, 300 sec: 24270.5). Total num frames: 121356288. Throughput: 0: 6089.8. Samples: 30344680. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:04:18,119][15372] Avg episode reward: [(0, '35.404')] [2024-08-05 16:04:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000014814_121356288.pth... 
[2024-08-05 16:04:18,227][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000014105_115548160.pth [2024-08-05 16:04:20,599][15444] Updated weights for policy 0, policy_version 14821 (0.0011) [2024-08-05 16:04:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24270.6). Total num frames: 121479168. Throughput: 0: 6083.8. Samples: 30363010. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:04:23,119][15372] Avg episode reward: [(0, '34.935')] [2024-08-05 16:04:23,787][15444] Updated weights for policy 0, policy_version 14831 (0.0023) [2024-08-05 16:04:27,177][15444] Updated weights for policy 0, policy_version 14841 (0.0028) [2024-08-05 16:04:28,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24303.9, 300 sec: 24242.8). Total num frames: 121593856. Throughput: 0: 6075.6. Samples: 30399360. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:04:28,119][15372] Avg episode reward: [(0, '36.183')] [2024-08-05 16:04:30,342][15417] Signal inference workers to stop experience collection... (5350 times) [2024-08-05 16:04:30,343][15417] Signal inference workers to resume experience collection... (5350 times) [2024-08-05 16:04:30,417][15444] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-08-05 16:04:30,417][15444] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-08-05 16:04:30,447][15444] Updated weights for policy 0, policy_version 14851 (0.0018) [2024-08-05 16:04:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 121716736. Throughput: 0: 6074.4. Samples: 30435320. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:04:33,126][15372] Avg episode reward: [(0, '36.818')] [2024-08-05 16:04:34,077][15444] Updated weights for policy 0, policy_version 14861 (0.0011) [2024-08-05 16:04:37,212][15444] Updated weights for policy 0, policy_version 14871 (0.0019) [2024-08-05 16:04:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 121839616. Throughput: 0: 6084.4. Samples: 30454400. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 16:04:38,119][15372] Avg episode reward: [(0, '36.661')] [2024-08-05 16:04:40,657][15444] Updated weights for policy 0, policy_version 14881 (0.0011) [2024-08-05 16:04:43,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 121962496. Throughput: 0: 6082.5. Samples: 30490580. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 16:04:43,126][15372] Avg episode reward: [(0, '35.279')] [2024-08-05 16:04:44,111][15444] Updated weights for policy 0, policy_version 14891 (0.0011) [2024-08-05 16:04:47,332][15444] Updated weights for policy 0, policy_version 14901 (0.0021) [2024-08-05 16:04:48,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 122085376. Throughput: 0: 6050.6. Samples: 30526110. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:04:48,119][15372] Avg episode reward: [(0, '35.826')] [2024-08-05 16:04:50,934][15444] Updated weights for policy 0, policy_version 14911 (0.0011) [2024-08-05 16:04:53,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24302.8, 300 sec: 24215.0). Total num frames: 122200064. Throughput: 0: 6060.4. Samples: 30545000. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:04:53,119][15372] Avg episode reward: [(0, '36.731')] [2024-08-05 16:04:54,126][15444] Updated weights for policy 0, policy_version 14921 (0.0012) [2024-08-05 16:04:57,765][15444] Updated weights for policy 0, policy_version 14931 (0.0011) [2024-08-05 16:04:58,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 122322944. Throughput: 0: 6049.3. Samples: 30580840. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:04:58,119][15372] Avg episode reward: [(0, '36.631')] [2024-08-05 16:05:01,123][15444] Updated weights for policy 0, policy_version 14941 (0.0016) [2024-08-05 16:05:03,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 122445824. Throughput: 0: 6035.1. Samples: 30616260. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:05:03,126][15372] Avg episode reward: [(0, '35.792')] [2024-08-05 16:05:04,423][15444] Updated weights for policy 0, policy_version 14951 (0.0012) [2024-08-05 16:05:08,075][15444] Updated weights for policy 0, policy_version 14961 (0.0011) [2024-08-05 16:05:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 122560512. Throughput: 0: 6058.0. Samples: 30635620. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 16:05:08,119][15372] Avg episode reward: [(0, '36.330')] [2024-08-05 16:05:11,108][15444] Updated weights for policy 0, policy_version 14971 (0.0011) [2024-08-05 16:05:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 122683392. Throughput: 0: 6043.8. Samples: 30671330. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 16:05:13,126][15372] Avg episode reward: [(0, '35.853')] [2024-08-05 16:05:14,610][15417] Signal inference workers to stop experience collection... (5400 times) [2024-08-05 16:05:14,618][15417] Signal inference workers to resume experience collection... 
(5400 times) [2024-08-05 16:05:14,660][15444] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-08-05 16:05:14,660][15444] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-08-05 16:05:14,699][15444] Updated weights for policy 0, policy_version 14981 (0.0032) [2024-08-05 16:05:18,049][15444] Updated weights for policy 0, policy_version 14991 (0.0013) [2024-08-05 16:05:18,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 122806272. Throughput: 0: 6046.4. Samples: 30707410. Policy #0 lag: (min: 2.0, avg: 4.2, max: 8.0) [2024-08-05 16:05:18,119][15372] Avg episode reward: [(0, '35.565')] [2024-08-05 16:05:21,490][15444] Updated weights for policy 0, policy_version 15001 (0.0011) [2024-08-05 16:05:23,121][15372] Fps is (10 sec: 24570.0, 60 sec: 24165.4, 300 sec: 24214.8). Total num frames: 122929152. Throughput: 0: 6031.7. Samples: 30725840. Policy #0 lag: (min: 2.0, avg: 4.2, max: 8.0) [2024-08-05 16:05:23,122][15372] Avg episode reward: [(0, '36.641')] [2024-08-05 16:05:24,956][15444] Updated weights for policy 0, policy_version 15011 (0.0014) [2024-08-05 16:05:28,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 123043840. Throughput: 0: 6042.6. Samples: 30762500. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 16:05:28,127][15372] Avg episode reward: [(0, '37.173')] [2024-08-05 16:05:28,167][15444] Updated weights for policy 0, policy_version 15021 (0.0029) [2024-08-05 16:05:31,693][15444] Updated weights for policy 0, policy_version 15031 (0.0020) [2024-08-05 16:05:33,118][15372] Fps is (10 sec: 23762.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 123166720. Throughput: 0: 6038.5. Samples: 30797840. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 16:05:33,119][15372] Avg episode reward: [(0, '37.391')] [2024-08-05 16:05:34,813][15444] Updated weights for policy 0, policy_version 15041 (0.0016) [2024-08-05 16:05:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 123289600. Throughput: 0: 6038.7. Samples: 30816740. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 16:05:38,127][15372] Avg episode reward: [(0, '37.151')] [2024-08-05 16:05:38,254][15444] Updated weights for policy 0, policy_version 15051 (0.0010) [2024-08-05 16:05:41,651][15444] Updated weights for policy 0, policy_version 15061 (0.0012) [2024-08-05 16:05:43,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 123412480. Throughput: 0: 6047.8. Samples: 30852990. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 16:05:43,127][15372] Avg episode reward: [(0, '36.829')] [2024-08-05 16:05:45,067][15444] Updated weights for policy 0, policy_version 15071 (0.0019) [2024-08-05 16:05:48,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 123535360. Throughput: 0: 6078.2. Samples: 30889780. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 16:05:48,126][15372] Avg episode reward: [(0, '36.552')] [2024-08-05 16:05:48,465][15444] Updated weights for policy 0, policy_version 15081 (0.0036) [2024-08-05 16:05:51,812][15444] Updated weights for policy 0, policy_version 15091 (0.0012) [2024-08-05 16:05:53,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 123658240. Throughput: 0: 6052.6. Samples: 30907990. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:05:53,119][15372] Avg episode reward: [(0, '36.860')] [2024-08-05 16:05:55,064][15444] Updated weights for policy 0, policy_version 15101 (0.0016) [2024-08-05 16:05:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24215.4). 
Total num frames: 123781120. Throughput: 0: 6074.2. Samples: 30944670. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:05:58,126][15372] Avg episode reward: [(0, '36.100')] [2024-08-05 16:05:58,622][15444] Updated weights for policy 0, policy_version 15111 (0.0019) [2024-08-05 16:06:00,006][15417] Signal inference workers to stop experience collection... (5450 times) [2024-08-05 16:06:00,006][15417] Signal inference workers to resume experience collection... (5450 times) [2024-08-05 16:06:00,069][15444] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-08-05 16:06:00,074][15444] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-08-05 16:06:01,874][15444] Updated weights for policy 0, policy_version 15121 (0.0034) [2024-08-05 16:06:03,118][15372] Fps is (10 sec: 23757.5, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 123895808. Throughput: 0: 6086.7. Samples: 30981310. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:06:03,119][15372] Avg episode reward: [(0, '34.853')] [2024-08-05 16:06:05,259][15444] Updated weights for policy 0, policy_version 15131 (0.0020) [2024-08-05 16:06:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 124026880. Throughput: 0: 6087.4. Samples: 30999760. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:06:08,126][15372] Avg episode reward: [(0, '34.578')] [2024-08-05 16:06:08,474][15444] Updated weights for policy 0, policy_version 15141 (0.0016) [2024-08-05 16:06:11,883][15444] Updated weights for policy 0, policy_version 15151 (0.0017) [2024-08-05 16:06:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 124141568. Throughput: 0: 6083.4. Samples: 31036250. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:06:13,119][15372] Avg episode reward: [(0, '35.042')] [2024-08-05 16:06:15,172][15444] Updated weights for policy 0, policy_version 15161 (0.0028) [2024-08-05 16:06:18,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24302.8, 300 sec: 24242.7). Total num frames: 124264448. Throughput: 0: 6116.8. Samples: 31073100. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:06:18,127][15372] Avg episode reward: [(0, '36.128')] [2024-08-05 16:06:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000015169_124264448.pth... [2024-08-05 16:06:18,275][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000014459_118448128.pth [2024-08-05 16:06:18,502][15444] Updated weights for policy 0, policy_version 15171 (0.0018) [2024-08-05 16:06:22,070][15444] Updated weights for policy 0, policy_version 15181 (0.0009) [2024-08-05 16:06:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.9, 300 sec: 24242.9). Total num frames: 124387328. Throughput: 0: 6090.9. Samples: 31090830. Policy #0 lag: (min: 0.0, avg: 4.8, max: 9.0) [2024-08-05 16:06:23,119][15372] Avg episode reward: [(0, '36.907')] [2024-08-05 16:06:25,573][15444] Updated weights for policy 0, policy_version 15191 (0.0020) [2024-08-05 16:06:28,118][15372] Fps is (10 sec: 23758.0, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 124502016. Throughput: 0: 6084.9. Samples: 31126810. Policy #0 lag: (min: 0.0, avg: 4.8, max: 9.0) [2024-08-05 16:06:28,126][15372] Avg episode reward: [(0, '36.564')] [2024-08-05 16:06:28,799][15444] Updated weights for policy 0, policy_version 15201 (0.0013) [2024-08-05 16:06:32,296][15444] Updated weights for policy 0, policy_version 15211 (0.0020) [2024-08-05 16:06:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24215.1). Total num frames: 124624896. 
Throughput: 0: 6062.4. Samples: 31162590. Policy #0 lag: (min: 0.0, avg: 4.8, max: 9.0) [2024-08-05 16:06:33,119][15372] Avg episode reward: [(0, '36.858')] [2024-08-05 16:06:35,459][15444] Updated weights for policy 0, policy_version 15221 (0.0016) [2024-08-05 16:06:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.1, 300 sec: 24215.0). Total num frames: 124747776. Throughput: 0: 6079.6. Samples: 31181570. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:06:38,119][15372] Avg episode reward: [(0, '37.184')] [2024-08-05 16:06:38,936][15444] Updated weights for policy 0, policy_version 15231 (0.0022) [2024-08-05 16:06:42,189][15444] Updated weights for policy 0, policy_version 15241 (0.0017) [2024-08-05 16:06:43,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 124870656. Throughput: 0: 6082.0. Samples: 31218360. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:06:43,126][15372] Avg episode reward: [(0, '36.742')] [2024-08-05 16:06:45,537][15444] Updated weights for policy 0, policy_version 15251 (0.0019) [2024-08-05 16:06:45,660][15417] Signal inference workers to stop experience collection... (5500 times) [2024-08-05 16:06:45,661][15417] Signal inference workers to resume experience collection... (5500 times) [2024-08-05 16:06:45,690][15444] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-08-05 16:06:45,690][15444] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-08-05 16:06:48,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 124993536. Throughput: 0: 6083.1. Samples: 31255050. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:06:48,119][15372] Avg episode reward: [(0, '36.611')] [2024-08-05 16:06:48,874][15444] Updated weights for policy 0, policy_version 15261 (0.0026) [2024-08-05 16:06:52,399][15444] Updated weights for policy 0, policy_version 15271 (0.0012) [2024-08-05 16:06:53,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.0, 300 sec: 24270.6). Total num frames: 125116416. Throughput: 0: 6094.0. Samples: 31273990. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:06:53,119][15372] Avg episode reward: [(0, '37.221')] [2024-08-05 16:06:55,691][15444] Updated weights for policy 0, policy_version 15281 (0.0023) [2024-08-05 16:06:58,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 125239296. Throughput: 0: 6080.2. Samples: 31309860. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 16:06:58,126][15372] Avg episode reward: [(0, '37.067')] [2024-08-05 16:06:59,101][15444] Updated weights for policy 0, policy_version 15291 (0.0030) [2024-08-05 16:07:02,515][15444] Updated weights for policy 0, policy_version 15301 (0.0015) [2024-08-05 16:07:03,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 125353984. Throughput: 0: 6054.3. Samples: 31345540. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 16:07:03,119][15372] Avg episode reward: [(0, '36.734')] [2024-08-05 16:07:05,674][15444] Updated weights for policy 0, policy_version 15311 (0.0022) [2024-08-05 16:07:08,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 125485056. Throughput: 0: 6070.2. Samples: 31363990. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:07:08,126][15372] Avg episode reward: [(0, '36.421')] [2024-08-05 16:07:09,358][15444] Updated weights for policy 0, policy_version 15321 (0.0012) [2024-08-05 16:07:12,934][15444] Updated weights for policy 0, policy_version 15331 (0.0026) [2024-08-05 16:07:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 125591552. Throughput: 0: 6076.7. Samples: 31400260. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:07:13,119][15372] Avg episode reward: [(0, '37.393')] [2024-08-05 16:07:15,976][15444] Updated weights for policy 0, policy_version 15341 (0.0037) [2024-08-05 16:07:18,121][15372] Fps is (10 sec: 23750.9, 60 sec: 24302.1, 300 sec: 24242.6). Total num frames: 125722624. Throughput: 0: 6080.1. Samples: 31436210. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:07:18,129][15372] Avg episode reward: [(0, '37.073')] [2024-08-05 16:07:19,449][15444] Updated weights for policy 0, policy_version 15351 (0.0015) [2024-08-05 16:07:22,772][15444] Updated weights for policy 0, policy_version 15361 (0.0011) [2024-08-05 16:07:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 125837312. Throughput: 0: 6058.4. Samples: 31454200. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:07:23,119][15372] Avg episode reward: [(0, '35.959')] [2024-08-05 16:07:26,210][15444] Updated weights for policy 0, policy_version 15371 (0.0019) [2024-08-05 16:07:28,119][15372] Fps is (10 sec: 23761.8, 60 sec: 24302.8, 300 sec: 24242.7). Total num frames: 125960192. Throughput: 0: 6039.7. Samples: 31490150. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:07:28,127][15372] Avg episode reward: [(0, '35.932')] [2024-08-05 16:07:29,676][15444] Updated weights for policy 0, policy_version 15381 (0.0018) [2024-08-05 16:07:33,079][15444] Updated weights for policy 0, policy_version 15391 (0.0023) [2024-08-05 16:07:33,120][15372] Fps is (10 sec: 24571.2, 60 sec: 24302.1, 300 sec: 24242.6). Total num frames: 126083072. Throughput: 0: 6034.9. Samples: 31526630. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:07:33,121][15372] Avg episode reward: [(0, '36.023')] [2024-08-05 16:07:36,197][15417] Signal inference workers to stop experience collection... (5550 times) [2024-08-05 16:07:36,198][15417] Signal inference workers to resume experience collection... (5550 times) [2024-08-05 16:07:36,265][15444] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-08-05 16:07:36,273][15444] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-08-05 16:07:36,297][15444] Updated weights for policy 0, policy_version 15401 (0.0013) [2024-08-05 16:07:38,119][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 126197760. Throughput: 0: 6030.6. Samples: 31545370. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:07:38,126][15372] Avg episode reward: [(0, '36.202')] [2024-08-05 16:07:39,930][15444] Updated weights for policy 0, policy_version 15411 (0.0014) [2024-08-05 16:07:43,119][15372] Fps is (10 sec: 23761.2, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 126320640. Throughput: 0: 6041.8. Samples: 31581740. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:07:43,126][15372] Avg episode reward: [(0, '36.813')] [2024-08-05 16:07:43,308][15444] Updated weights for policy 0, policy_version 15421 (0.0030) [2024-08-05 16:07:46,557][15444] Updated weights for policy 0, policy_version 15431 (0.0013) [2024-08-05 16:07:48,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.5, 300 sec: 24270.5). Total num frames: 126443520. Throughput: 0: 6045.6. Samples: 31617590. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:07:48,126][15372] Avg episode reward: [(0, '37.441')] [2024-08-05 16:07:48,129][15417] Saving new best policy, reward=37.441! [2024-08-05 16:07:50,073][15444] Updated weights for policy 0, policy_version 15441 (0.0014) [2024-08-05 16:07:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 126566400. Throughput: 0: 6040.4. Samples: 31635810. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:07:53,126][15372] Avg episode reward: [(0, '37.653')] [2024-08-05 16:07:53,127][15417] Saving new best policy, reward=37.653! [2024-08-05 16:07:53,450][15444] Updated weights for policy 0, policy_version 15451 (0.0029) [2024-08-05 16:07:56,744][15444] Updated weights for policy 0, policy_version 15461 (0.0011) [2024-08-05 16:07:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24270.5). Total num frames: 126681088. Throughput: 0: 6023.8. Samples: 31671330. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:07:58,119][15372] Avg episode reward: [(0, '37.139')] [2024-08-05 16:08:00,114][15444] Updated weights for policy 0, policy_version 15471 (0.0021) [2024-08-05 16:08:03,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.8, 300 sec: 24298.3). Total num frames: 126812160. Throughput: 0: 6047.0. Samples: 31708310. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:08:03,126][15372] Avg episode reward: [(0, '37.029')] [2024-08-05 16:08:03,504][15444] Updated weights for policy 0, policy_version 15481 (0.0015) [2024-08-05 16:08:06,917][15444] Updated weights for policy 0, policy_version 15491 (0.0032) [2024-08-05 16:08:08,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24029.8, 300 sec: 24242.8). Total num frames: 126926848. Throughput: 0: 6052.0. Samples: 31726540. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:08:08,119][15372] Avg episode reward: [(0, '36.923')] [2024-08-05 16:08:10,327][15444] Updated weights for policy 0, policy_version 15501 (0.0017) [2024-08-05 16:08:13,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24302.7, 300 sec: 24270.5). Total num frames: 127049728. Throughput: 0: 6067.1. Samples: 31763170. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:08:13,119][15372] Avg episode reward: [(0, '36.248')] [2024-08-05 16:08:13,576][15444] Updated weights for policy 0, policy_version 15511 (0.0011) [2024-08-05 16:08:16,970][15444] Updated weights for policy 0, policy_version 15521 (0.0011) [2024-08-05 16:08:18,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24167.4, 300 sec: 24270.5). Total num frames: 127172608. Throughput: 0: 6059.1. Samples: 31799280. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:08:18,119][15372] Avg episode reward: [(0, '35.972')] [2024-08-05 16:08:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000015524_127172608.pth... [2024-08-05 16:08:18,240][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000014814_121356288.pth [2024-08-05 16:08:20,261][15444] Updated weights for policy 0, policy_version 15531 (0.0012) [2024-08-05 16:08:23,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24302.9, 300 sec: 24270.7). Total num frames: 127295488. 
Throughput: 0: 6051.8. Samples: 31817700. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:08:23,127][15372] Avg episode reward: [(0, '36.108')] [2024-08-05 16:08:23,847][15444] Updated weights for policy 0, policy_version 15541 (0.0029) [2024-08-05 16:08:27,187][15444] Updated weights for policy 0, policy_version 15551 (0.0011) [2024-08-05 16:08:28,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.1, 300 sec: 24270.5). Total num frames: 127418368. Throughput: 0: 6040.7. Samples: 31853570. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:08:28,119][15372] Avg episode reward: [(0, '35.974')] [2024-08-05 16:08:30,555][15444] Updated weights for policy 0, policy_version 15561 (0.0029) [2024-08-05 16:08:31,797][15417] Signal inference workers to stop experience collection... (5600 times) [2024-08-05 16:08:31,805][15417] Signal inference workers to resume experience collection... (5600 times) [2024-08-05 16:08:31,872][15444] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-08-05 16:08:31,872][15444] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-08-05 16:08:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24167.2, 300 sec: 24242.8). Total num frames: 127533056. Throughput: 0: 6047.8. Samples: 31889740. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:08:33,119][15372] Avg episode reward: [(0, '35.977')] [2024-08-05 16:08:33,897][15444] Updated weights for policy 0, policy_version 15571 (0.0010) [2024-08-05 16:08:37,281][15444] Updated weights for policy 0, policy_version 15581 (0.0011) [2024-08-05 16:08:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 127655936. Throughput: 0: 6059.3. Samples: 31908480. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:08:38,119][15372] Avg episode reward: [(0, '35.707')] [2024-08-05 16:08:40,708][15444] Updated weights for policy 0, policy_version 15591 (0.0026) [2024-08-05 16:08:43,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 127778816. Throughput: 0: 6078.6. Samples: 31944870. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:08:43,119][15372] Avg episode reward: [(0, '35.521')] [2024-08-05 16:08:43,795][15444] Updated weights for policy 0, policy_version 15601 (0.0022) [2024-08-05 16:08:47,546][15444] Updated weights for policy 0, policy_version 15611 (0.0028) [2024-08-05 16:08:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 127893504. Throughput: 0: 6053.4. Samples: 31980710. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:08:48,119][15372] Avg episode reward: [(0, '36.417')] [2024-08-05 16:08:50,853][15444] Updated weights for policy 0, policy_version 15621 (0.0011) [2024-08-05 16:08:53,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.2, 300 sec: 24242.7). Total num frames: 128016384. Throughput: 0: 6063.3. Samples: 31999390. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:08:53,127][15372] Avg episode reward: [(0, '36.247')] [2024-08-05 16:08:54,359][15444] Updated weights for policy 0, policy_version 15631 (0.0018) [2024-08-05 16:08:57,808][15444] Updated weights for policy 0, policy_version 15641 (0.0011) [2024-08-05 16:08:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 128139264. Throughput: 0: 6040.9. Samples: 32035010. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:08:58,119][15372] Avg episode reward: [(0, '35.958')] [2024-08-05 16:09:00,914][15444] Updated weights for policy 0, policy_version 15651 (0.0029) [2024-08-05 16:09:03,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24166.5, 300 sec: 24242.8). 
Total num frames: 128262144. Throughput: 0: 6041.6. Samples: 32071150. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:09:03,126][15372] Avg episode reward: [(0, '36.765')] [2024-08-05 16:09:04,370][15444] Updated weights for policy 0, policy_version 15661 (0.0023) [2024-08-05 16:09:07,874][15444] Updated weights for policy 0, policy_version 15671 (0.0020) [2024-08-05 16:09:08,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 128376832. Throughput: 0: 6041.1. Samples: 32089550. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:09:08,120][15372] Avg episode reward: [(0, '36.873')] [2024-08-05 16:09:11,269][15444] Updated weights for policy 0, policy_version 15681 (0.0030) [2024-08-05 16:09:13,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 128499712. Throughput: 0: 6044.9. Samples: 32125590. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:09:13,127][15372] Avg episode reward: [(0, '36.891')] [2024-08-05 16:09:14,416][15444] Updated weights for policy 0, policy_version 15691 (0.0014) [2024-08-05 16:09:17,950][15444] Updated weights for policy 0, policy_version 15701 (0.0036) [2024-08-05 16:09:18,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24166.1, 300 sec: 24214.9). Total num frames: 128622592. Throughput: 0: 6058.3. Samples: 32162370. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:09:18,120][15372] Avg episode reward: [(0, '36.117')] [2024-08-05 16:09:21,354][15444] Updated weights for policy 0, policy_version 15711 (0.0010) [2024-08-05 16:09:22,399][15417] Signal inference workers to stop experience collection... (5650 times) [2024-08-05 16:09:22,407][15417] Signal inference workers to resume experience collection... 
(5650 times) [2024-08-05 16:09:22,433][15444] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-08-05 16:09:22,433][15444] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-08-05 16:09:23,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 128745472. Throughput: 0: 6050.0. Samples: 32180730. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:09:23,119][15372] Avg episode reward: [(0, '36.208')] [2024-08-05 16:09:24,447][15444] Updated weights for policy 0, policy_version 15721 (0.0016) [2024-08-05 16:09:28,045][15444] Updated weights for policy 0, policy_version 15731 (0.0014) [2024-08-05 16:09:28,118][15372] Fps is (10 sec: 24578.1, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 128868352. Throughput: 0: 6060.7. Samples: 32217600. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:09:28,119][15372] Avg episode reward: [(0, '36.269')] [2024-08-05 16:09:31,219][15444] Updated weights for policy 0, policy_version 15741 (0.0015) [2024-08-05 16:09:33,119][15372] Fps is (10 sec: 23755.3, 60 sec: 24166.2, 300 sec: 24214.9). Total num frames: 128983040. Throughput: 0: 6055.7. Samples: 32253220. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:09:33,127][15372] Avg episode reward: [(0, '36.175')] [2024-08-05 16:09:34,793][15444] Updated weights for policy 0, policy_version 15751 (0.0021) [2024-08-05 16:09:38,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24166.2, 300 sec: 24215.0). Total num frames: 129105920. Throughput: 0: 6051.8. Samples: 32271720. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:09:38,127][15372] Avg episode reward: [(0, '36.329')] [2024-08-05 16:09:38,157][15444] Updated weights for policy 0, policy_version 15761 (0.0015) [2024-08-05 16:09:41,498][15444] Updated weights for policy 0, policy_version 15771 (0.0012) [2024-08-05 16:09:43,120][15372] Fps is (10 sec: 24574.7, 60 sec: 24166.0, 300 sec: 24214.9). 
Total num frames: 129228800. Throughput: 0: 6065.6. Samples: 32307970. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:09:43,127][15372] Avg episode reward: [(0, '35.293')] [2024-08-05 16:09:44,910][15444] Updated weights for policy 0, policy_version 15781 (0.0012) [2024-08-05 16:09:48,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 129351680. Throughput: 0: 6069.3. Samples: 32344270. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:09:48,126][15372] Avg episode reward: [(0, '35.813')] [2024-08-05 16:09:48,425][15444] Updated weights for policy 0, policy_version 15791 (0.0020) [2024-08-05 16:09:51,471][15444] Updated weights for policy 0, policy_version 15801 (0.0019) [2024-08-05 16:09:53,119][15372] Fps is (10 sec: 24578.5, 60 sec: 24303.1, 300 sec: 24242.8). Total num frames: 129474560. Throughput: 0: 6077.8. Samples: 32363050. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:09:53,126][15372] Avg episode reward: [(0, '36.304')] [2024-08-05 16:09:55,023][15444] Updated weights for policy 0, policy_version 15811 (0.0016) [2024-08-05 16:09:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 129597440. Throughput: 0: 6096.2. Samples: 32399920. Policy #0 lag: (min: 0.0, avg: 3.6, max: 9.0) [2024-08-05 16:09:58,126][15372] Avg episode reward: [(0, '36.269')] [2024-08-05 16:09:58,369][15444] Updated weights for policy 0, policy_version 15821 (0.0011) [2024-08-05 16:10:01,635][15444] Updated weights for policy 0, policy_version 15831 (0.0023) [2024-08-05 16:10:03,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 129712128. Throughput: 0: 6081.4. Samples: 32436030. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 9.0) [2024-08-05 16:10:03,126][15372] Avg episode reward: [(0, '36.471')] [2024-08-05 16:10:05,123][15444] Updated weights for policy 0, policy_version 15841 (0.0022) [2024-08-05 16:10:08,119][15372] Fps is (10 sec: 24574.1, 60 sec: 24439.3, 300 sec: 24270.5). Total num frames: 129843200. Throughput: 0: 6080.8. Samples: 32454370. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:10:08,127][15372] Avg episode reward: [(0, '36.663')] [2024-08-05 16:10:08,248][15444] Updated weights for policy 0, policy_version 15851 (0.0016) [2024-08-05 16:10:10,154][15417] Signal inference workers to stop experience collection... (5700 times) [2024-08-05 16:10:10,155][15417] Signal inference workers to resume experience collection... (5700 times) [2024-08-05 16:10:10,198][15444] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-08-05 16:10:10,199][15444] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-08-05 16:10:11,856][15444] Updated weights for policy 0, policy_version 15861 (0.0031) [2024-08-05 16:10:13,126][15372] Fps is (10 sec: 25376.9, 60 sec: 24436.6, 300 sec: 24270.0). Total num frames: 129966080. Throughput: 0: 6076.4. Samples: 32491080. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:10:13,126][15372] Avg episode reward: [(0, '36.873')] [2024-08-05 16:10:15,182][15444] Updated weights for policy 0, policy_version 15871 (0.0020) [2024-08-05 16:10:18,119][15372] Fps is (10 sec: 23758.2, 60 sec: 24303.2, 300 sec: 24243.0). Total num frames: 130080768. Throughput: 0: 6086.1. Samples: 32527090. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:10:18,126][15372] Avg episode reward: [(0, '35.591')] [2024-08-05 16:10:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000015879_130080768.pth... 
[2024-08-05 16:10:18,256][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000015169_124264448.pth [2024-08-05 16:10:18,569][15444] Updated weights for policy 0, policy_version 15881 (0.0021) [2024-08-05 16:10:22,085][15444] Updated weights for policy 0, policy_version 15891 (0.0013) [2024-08-05 16:10:23,119][15372] Fps is (10 sec: 22953.6, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 130195456. Throughput: 0: 6069.6. Samples: 32544850. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 16:10:23,119][15372] Avg episode reward: [(0, '36.104')] [2024-08-05 16:10:25,360][15444] Updated weights for policy 0, policy_version 15901 (0.0019) [2024-08-05 16:10:28,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 130326528. Throughput: 0: 6069.5. Samples: 32581090. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 16:10:28,119][15372] Avg episode reward: [(0, '37.340')] [2024-08-05 16:10:28,868][15444] Updated weights for policy 0, policy_version 15911 (0.0012) [2024-08-05 16:10:32,052][15444] Updated weights for policy 0, policy_version 15921 (0.0024) [2024-08-05 16:10:33,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24303.2, 300 sec: 24242.8). Total num frames: 130441216. Throughput: 0: 6061.6. Samples: 32617040. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:10:33,119][15372] Avg episode reward: [(0, '36.777')] [2024-08-05 16:10:35,587][15444] Updated weights for policy 0, policy_version 15931 (0.0025) [2024-08-05 16:10:38,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.6, 300 sec: 24270.5). Total num frames: 130572288. Throughput: 0: 6061.1. Samples: 32635800. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:10:38,119][15372] Avg episode reward: [(0, '36.980')] [2024-08-05 16:10:38,899][15444] Updated weights for policy 0, policy_version 15941 (0.0036) [2024-08-05 16:10:42,262][15444] Updated weights for policy 0, policy_version 15951 (0.0011) [2024-08-05 16:10:43,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.4, 300 sec: 24242.8). Total num frames: 130686976. Throughput: 0: 6048.9. Samples: 32672120. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 16:10:43,119][15372] Avg episode reward: [(0, '36.506')] [2024-08-05 16:10:45,747][15444] Updated weights for policy 0, policy_version 15961 (0.0012) [2024-08-05 16:10:48,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 130809856. Throughput: 0: 6056.6. Samples: 32708580. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 16:10:48,119][15372] Avg episode reward: [(0, '36.561')] [2024-08-05 16:10:48,888][15444] Updated weights for policy 0, policy_version 15971 (0.0021) [2024-08-05 16:10:52,495][15444] Updated weights for policy 0, policy_version 15981 (0.0016) [2024-08-05 16:10:53,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24302.8, 300 sec: 24242.7). Total num frames: 130932736. Throughput: 0: 6051.8. Samples: 32726700. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 16:10:53,119][15372] Avg episode reward: [(0, '36.717')] [2024-08-05 16:10:55,922][15444] Updated weights for policy 0, policy_version 15991 (0.0013) [2024-08-05 16:10:58,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 131047424. Throughput: 0: 6039.8. Samples: 32762830. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 16:10:58,126][15372] Avg episode reward: [(0, '36.572')] [2024-08-05 16:10:59,277][15444] Updated weights for policy 0, policy_version 16001 (0.0012) [2024-08-05 16:11:02,629][15444] Updated weights for policy 0, policy_version 16011 (0.0016) [2024-08-05 16:11:03,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 131170304. Throughput: 0: 6035.6. Samples: 32798690. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 16:11:03,119][15372] Avg episode reward: [(0, '36.061')] [2024-08-05 16:11:03,841][15417] Signal inference workers to stop experience collection... (5750 times) [2024-08-05 16:11:03,841][15417] Signal inference workers to resume experience collection... (5750 times) [2024-08-05 16:11:03,872][15444] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-08-05 16:11:03,872][15444] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-08-05 16:11:05,901][15444] Updated weights for policy 0, policy_version 16021 (0.0015) [2024-08-05 16:11:08,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.6, 300 sec: 24242.7). Total num frames: 131293184. Throughput: 0: 6050.0. Samples: 32817100. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:11:08,119][15372] Avg episode reward: [(0, '36.255')] [2024-08-05 16:11:09,459][15444] Updated weights for policy 0, policy_version 16031 (0.0014) [2024-08-05 16:11:12,901][15444] Updated weights for policy 0, policy_version 16041 (0.0022) [2024-08-05 16:11:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24032.7, 300 sec: 24215.0). Total num frames: 131407872. Throughput: 0: 6059.8. Samples: 32853780. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:11:13,119][15372] Avg episode reward: [(0, '36.576')] [2024-08-05 16:11:16,172][15444] Updated weights for policy 0, policy_version 16051 (0.0018) [2024-08-05 16:11:18,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 131538944. Throughput: 0: 6066.0. Samples: 32890010. Policy #0 lag: (min: 0.0, avg: 4.5, max: 7.0) [2024-08-05 16:11:18,126][15372] Avg episode reward: [(0, '36.139')] [2024-08-05 16:11:19,590][15444] Updated weights for policy 0, policy_version 16061 (0.0018) [2024-08-05 16:11:22,613][15444] Updated weights for policy 0, policy_version 16071 (0.0028) [2024-08-05 16:11:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 131653632. Throughput: 0: 6052.4. Samples: 32908160. Policy #0 lag: (min: 0.0, avg: 4.5, max: 7.0) [2024-08-05 16:11:23,119][15372] Avg episode reward: [(0, '36.971')] [2024-08-05 16:11:26,327][15444] Updated weights for policy 0, policy_version 16081 (0.0010) [2024-08-05 16:11:28,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 131776512. Throughput: 0: 6049.8. Samples: 32944360. Policy #0 lag: (min: 0.0, avg: 4.5, max: 7.0) [2024-08-05 16:11:28,119][15372] Avg episode reward: [(0, '36.513')] [2024-08-05 16:11:29,627][15444] Updated weights for policy 0, policy_version 16091 (0.0020) [2024-08-05 16:11:32,953][15444] Updated weights for policy 0, policy_version 16101 (0.0018) [2024-08-05 16:11:33,120][15372] Fps is (10 sec: 24572.7, 60 sec: 24302.4, 300 sec: 24242.7). Total num frames: 131899392. Throughput: 0: 6056.7. Samples: 32981140. Policy #0 lag: (min: 1.0, avg: 3.1, max: 7.0) [2024-08-05 16:11:33,120][15372] Avg episode reward: [(0, '35.696')] [2024-08-05 16:11:36,569][15444] Updated weights for policy 0, policy_version 16111 (0.0034) [2024-08-05 16:11:38,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24029.8, 300 sec: 24215.0). 
Total num frames: 132014080. Throughput: 0: 6060.5. Samples: 32999420. Policy #0 lag: (min: 1.0, avg: 3.1, max: 7.0) [2024-08-05 16:11:38,119][15372] Avg episode reward: [(0, '37.143')] [2024-08-05 16:11:39,554][15444] Updated weights for policy 0, policy_version 16121 (0.0025) [2024-08-05 16:11:43,119][15372] Fps is (10 sec: 23759.6, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 132136960. Throughput: 0: 6064.9. Samples: 33035750. Policy #0 lag: (min: 1.0, avg: 4.9, max: 7.0) [2024-08-05 16:11:43,126][15372] Avg episode reward: [(0, '37.436')] [2024-08-05 16:11:43,287][15444] Updated weights for policy 0, policy_version 16131 (0.0040) [2024-08-05 16:11:43,405][15417] Signal inference workers to stop experience collection... (5800 times) [2024-08-05 16:11:43,406][15417] Signal inference workers to resume experience collection... (5800 times) [2024-08-05 16:11:43,439][15444] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-08-05 16:11:43,466][15444] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-08-05 16:11:46,496][15444] Updated weights for policy 0, policy_version 16141 (0.0043) [2024-08-05 16:11:48,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 132259840. Throughput: 0: 6072.7. Samples: 33071960. Policy #0 lag: (min: 1.0, avg: 4.9, max: 7.0) [2024-08-05 16:11:48,126][15372] Avg episode reward: [(0, '36.849')] [2024-08-05 16:11:49,861][15444] Updated weights for policy 0, policy_version 16151 (0.0012) [2024-08-05 16:11:53,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 132382720. Throughput: 0: 6072.0. Samples: 33090340. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:11:53,126][15372] Avg episode reward: [(0, '35.976')] [2024-08-05 16:11:53,400][15444] Updated weights for policy 0, policy_version 16161 (0.0025) [2024-08-05 16:11:56,527][15444] Updated weights for policy 0, policy_version 16171 (0.0019) [2024-08-05 16:11:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 132505600. Throughput: 0: 6061.3. Samples: 33126540. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:11:58,126][15372] Avg episode reward: [(0, '36.431')] [2024-08-05 16:11:59,982][15444] Updated weights for policy 0, policy_version 16181 (0.0018) [2024-08-05 16:12:03,128][15372] Fps is (10 sec: 24552.6, 60 sec: 24299.1, 300 sec: 24214.2). Total num frames: 132628480. Throughput: 0: 6078.5. Samples: 33163600. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:12:03,136][15372] Avg episode reward: [(0, '36.655')] [2024-08-05 16:12:03,422][15444] Updated weights for policy 0, policy_version 16191 (0.0011) [2024-08-05 16:12:06,631][15444] Updated weights for policy 0, policy_version 16201 (0.0012) [2024-08-05 16:12:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 132751360. Throughput: 0: 6084.0. Samples: 33181940. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 16:12:08,126][15372] Avg episode reward: [(0, '35.561')] [2024-08-05 16:12:10,066][15444] Updated weights for policy 0, policy_version 16211 (0.0011) [2024-08-05 16:12:13,119][15372] Fps is (10 sec: 24598.6, 60 sec: 24439.3, 300 sec: 24242.9). Total num frames: 132874240. Throughput: 0: 6093.5. Samples: 33218570. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 16:12:13,127][15372] Avg episode reward: [(0, '36.452')] [2024-08-05 16:12:13,356][15444] Updated weights for policy 0, policy_version 16221 (0.0015) [2024-08-05 16:12:16,587][15444] Updated weights for policy 0, policy_version 16231 (0.0011) [2024-08-05 16:12:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 132997120. Throughput: 0: 6080.2. Samples: 33254740. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:12:18,126][15372] Avg episode reward: [(0, '36.495')] [2024-08-05 16:12:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000016235_132997120.pth... [2024-08-05 16:12:18,271][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000015524_127172608.pth [2024-08-05 16:12:19,930][15444] Updated weights for policy 0, policy_version 16241 (0.0024) [2024-08-05 16:12:23,118][15372] Fps is (10 sec: 23757.6, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 133111808. Throughput: 0: 6087.6. Samples: 33273360. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:12:23,126][15372] Avg episode reward: [(0, '36.263')] [2024-08-05 16:12:23,582][15444] Updated weights for policy 0, policy_version 16251 (0.0012) [2024-08-05 16:12:26,906][15444] Updated weights for policy 0, policy_version 16261 (0.0017) [2024-08-05 16:12:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24270.7). Total num frames: 133242880. Throughput: 0: 6082.9. Samples: 33309480. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:12:28,119][15372] Avg episode reward: [(0, '36.308')] [2024-08-05 16:12:30,191][15444] Updated weights for policy 0, policy_version 16271 (0.0027) [2024-08-05 16:12:33,119][15372] Fps is (10 sec: 24574.6, 60 sec: 24303.3, 300 sec: 24270.5). Total num frames: 133357568. 
Throughput: 0: 6082.1. Samples: 33345660. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:12:33,127][15372] Avg episode reward: [(0, '36.902')] [2024-08-05 16:12:33,756][15444] Updated weights for policy 0, policy_version 16281 (0.0011) [2024-08-05 16:12:36,929][15444] Updated weights for policy 0, policy_version 16291 (0.0011) [2024-08-05 16:12:38,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 133472256. Throughput: 0: 6085.1. Samples: 33364170. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:12:38,119][15372] Avg episode reward: [(0, '36.934')] [2024-08-05 16:12:38,288][15417] Signal inference workers to stop experience collection... (5850 times) [2024-08-05 16:12:38,288][15417] Signal inference workers to resume experience collection... (5850 times) [2024-08-05 16:12:38,321][15444] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-08-05 16:12:38,321][15444] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-08-05 16:12:40,360][15444] Updated weights for policy 0, policy_version 16301 (0.0017) [2024-08-05 16:12:43,119][15372] Fps is (10 sec: 24576.9, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 133603328. Throughput: 0: 6097.1. Samples: 33400910. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 16:12:43,119][15372] Avg episode reward: [(0, '36.983')] [2024-08-05 16:12:43,763][15444] Updated weights for policy 0, policy_version 16311 (0.0027) [2024-08-05 16:12:47,016][15444] Updated weights for policy 0, policy_version 16321 (0.0011) [2024-08-05 16:12:48,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 133726208. Throughput: 0: 6072.6. Samples: 33436810. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 16:12:48,119][15372] Avg episode reward: [(0, '36.345')] [2024-08-05 16:12:50,576][15444] Updated weights for policy 0, policy_version 16331 (0.0023) [2024-08-05 16:12:53,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 133849088. Throughput: 0: 6094.4. Samples: 33456190. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:12:53,119][15372] Avg episode reward: [(0, '36.935')] [2024-08-05 16:12:53,680][15444] Updated weights for policy 0, policy_version 16341 (0.0022) [2024-08-05 16:12:57,163][15444] Updated weights for policy 0, policy_version 16351 (0.0010) [2024-08-05 16:12:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24270.6). Total num frames: 133971968. Throughput: 0: 6083.8. Samples: 33492340. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:12:58,119][15372] Avg episode reward: [(0, '36.834')] [2024-08-05 16:13:00,534][15444] Updated weights for policy 0, policy_version 16361 (0.0014) [2024-08-05 16:13:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24306.8, 300 sec: 24270.6). Total num frames: 134086656. Throughput: 0: 6079.8. Samples: 33528330. Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 16:13:03,126][15372] Avg episode reward: [(0, '35.897')] [2024-08-05 16:13:03,909][15444] Updated weights for policy 0, policy_version 16371 (0.0012) [2024-08-05 16:13:07,643][15444] Updated weights for policy 0, policy_version 16381 (0.0016) [2024-08-05 16:13:08,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24302.8, 300 sec: 24270.5). Total num frames: 134209536. Throughput: 0: 6069.7. Samples: 33546500. Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 16:13:08,119][15372] Avg episode reward: [(0, '35.814')] [2024-08-05 16:13:10,529][15444] Updated weights for policy 0, policy_version 16391 (0.0025) [2024-08-05 16:13:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.1, 300 sec: 24270.5). 
Total num frames: 134332416. Throughput: 0: 6065.8. Samples: 33582440. Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 16:13:13,126][15372] Avg episode reward: [(0, '35.562')] [2024-08-05 16:13:14,270][15444] Updated weights for policy 0, policy_version 16401 (0.0012) [2024-08-05 16:13:17,663][15444] Updated weights for policy 0, policy_version 16411 (0.0018) [2024-08-05 16:13:18,118][15372] Fps is (10 sec: 22938.4, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 134438912. Throughput: 0: 6061.0. Samples: 33618400. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:13:18,119][15372] Avg episode reward: [(0, '35.067')] [2024-08-05 16:13:18,506][15417] Signal inference workers to stop experience collection... (5900 times) [2024-08-05 16:13:18,511][15417] Signal inference workers to resume experience collection... (5900 times) [2024-08-05 16:13:18,577][15444] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-08-05 16:13:18,577][15444] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-08-05 16:13:20,765][15444] Updated weights for policy 0, policy_version 16421 (0.0012) [2024-08-05 16:13:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.4, 300 sec: 24270.5). Total num frames: 134578176. Throughput: 0: 6060.4. Samples: 33636890. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:13:23,119][15372] Avg episode reward: [(0, '35.784')] [2024-08-05 16:13:24,425][15444] Updated weights for policy 0, policy_version 16431 (0.0039) [2024-08-05 16:13:27,428][15444] Updated weights for policy 0, policy_version 16441 (0.0020) [2024-08-05 16:13:28,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24029.7, 300 sec: 24242.7). Total num frames: 134684672. Throughput: 0: 6059.8. Samples: 33673600. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 16:13:28,119][15372] Avg episode reward: [(0, '36.382')] [2024-08-05 16:13:31,041][15444] Updated weights for policy 0, policy_version 16451 (0.0022) [2024-08-05 16:13:33,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 134815744. Throughput: 0: 6069.3. Samples: 33709930. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 16:13:33,119][15372] Avg episode reward: [(0, '36.748')] [2024-08-05 16:13:34,580][15444] Updated weights for policy 0, policy_version 16461 (0.0018) [2024-08-05 16:13:37,802][15444] Updated weights for policy 0, policy_version 16471 (0.0041) [2024-08-05 16:13:38,121][15372] Fps is (10 sec: 25390.4, 60 sec: 24438.6, 300 sec: 24270.4). Total num frames: 134938624. Throughput: 0: 6026.2. Samples: 33727380. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 16:13:38,121][15372] Avg episode reward: [(0, '35.462')] [2024-08-05 16:13:41,451][15444] Updated weights for policy 0, policy_version 16481 (0.0035) [2024-08-05 16:13:43,118][15372] Fps is (10 sec: 23757.8, 60 sec: 24166.5, 300 sec: 24270.5). Total num frames: 135053312. Throughput: 0: 6024.2. Samples: 33763430. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 16:13:43,119][15372] Avg episode reward: [(0, '35.917')] [2024-08-05 16:13:44,480][15444] Updated weights for policy 0, policy_version 16491 (0.0013) [2024-08-05 16:13:48,096][15444] Updated weights for policy 0, policy_version 16501 (0.0023) [2024-08-05 16:13:48,119][15372] Fps is (10 sec: 23761.8, 60 sec: 24166.4, 300 sec: 24270.6). Total num frames: 135176192. Throughput: 0: 6031.3. Samples: 33799740. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 16:13:48,119][15372] Avg episode reward: [(0, '36.829')] [2024-08-05 16:13:51,195][15444] Updated weights for policy 0, policy_version 16511 (0.0019) [2024-08-05 16:13:52,854][15417] Signal inference workers to stop experience collection... 
(5950 times) [2024-08-05 16:13:52,863][15417] Signal inference workers to resume experience collection... (5950 times) [2024-08-05 16:13:52,898][15444] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-08-05 16:13:52,899][15444] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-08-05 16:13:53,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 135299072. Throughput: 0: 6032.9. Samples: 33817980. Policy #0 lag: (min: 0.0, avg: 2.7, max: 7.0) [2024-08-05 16:13:53,119][15372] Avg episode reward: [(0, '36.918')] [2024-08-05 16:13:54,779][15444] Updated weights for policy 0, policy_version 16521 (0.0017) [2024-08-05 16:13:58,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.7, 300 sec: 24242.7). Total num frames: 135413760. Throughput: 0: 6051.3. Samples: 33854750. Policy #0 lag: (min: 0.0, avg: 2.7, max: 7.0) [2024-08-05 16:13:58,127][15372] Avg episode reward: [(0, '35.806')] [2024-08-05 16:13:58,467][15444] Updated weights for policy 0, policy_version 16531 (0.0024) [2024-08-05 16:14:01,432][15444] Updated weights for policy 0, policy_version 16541 (0.0015) [2024-08-05 16:14:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24270.6). Total num frames: 135536640. Throughput: 0: 6060.0. Samples: 33891100. Policy #0 lag: (min: 1.0, avg: 5.0, max: 8.0) [2024-08-05 16:14:03,126][15372] Avg episode reward: [(0, '36.303')] [2024-08-05 16:14:05,060][15444] Updated weights for policy 0, policy_version 16551 (0.0028) [2024-08-05 16:14:08,083][15444] Updated weights for policy 0, policy_version 16561 (0.0014) [2024-08-05 16:14:08,118][15372] Fps is (10 sec: 25395.9, 60 sec: 24303.1, 300 sec: 24298.3). Total num frames: 135667712. Throughput: 0: 6046.0. Samples: 33908960. 
Policy #0 lag: (min: 1.0, avg: 5.0, max: 8.0) [2024-08-05 16:14:08,119][15372] Avg episode reward: [(0, '37.418')] [2024-08-05 16:14:11,746][15444] Updated weights for policy 0, policy_version 16571 (0.0021) [2024-08-05 16:14:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24242.8). Total num frames: 135774208. Throughput: 0: 6034.5. Samples: 33945150. Policy #0 lag: (min: 1.0, avg: 5.0, max: 8.0) [2024-08-05 16:14:13,119][15372] Avg episode reward: [(0, '36.806')] [2024-08-05 16:14:15,284][15444] Updated weights for policy 0, policy_version 16581 (0.0019) [2024-08-05 16:14:17,906][15417] Signal inference workers to stop experience collection... (6000 times) [2024-08-05 16:14:17,910][15417] Signal inference workers to resume experience collection... (6000 times) [2024-08-05 16:14:17,982][15444] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-08-05 16:14:17,982][15444] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-08-05 16:14:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 135905280. Throughput: 0: 6043.8. Samples: 33981900. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:14:18,127][15372] Avg episode reward: [(0, '37.870')] [2024-08-05 16:14:18,131][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000016590_135905280.pth... [2024-08-05 16:14:18,264][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000015879_130080768.pth [2024-08-05 16:14:18,270][15417] Saving new best policy, reward=37.870! [2024-08-05 16:14:18,489][15444] Updated weights for policy 0, policy_version 16591 (0.0041) [2024-08-05 16:14:22,095][15444] Updated weights for policy 0, policy_version 16601 (0.0026) [2024-08-05 16:14:23,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.9, 300 sec: 24242.8). 
Total num frames: 136019968. Throughput: 0: 6044.7. Samples: 33999380. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:14:23,119][15372] Avg episode reward: [(0, '37.185')] [2024-08-05 16:14:25,395][15444] Updated weights for policy 0, policy_version 16611 (0.0012) [2024-08-05 16:14:28,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24166.5, 300 sec: 24242.8). Total num frames: 136134656. Throughput: 0: 6039.6. Samples: 34035210. Policy #0 lag: (min: 0.0, avg: 4.9, max: 9.0) [2024-08-05 16:14:28,120][15372] Avg episode reward: [(0, '36.614')] [2024-08-05 16:14:28,838][15444] Updated weights for policy 0, policy_version 16621 (0.0017) [2024-08-05 16:14:32,295][15444] Updated weights for policy 0, policy_version 16631 (0.0012) [2024-08-05 16:14:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.6, 300 sec: 24270.6). Total num frames: 136265728. Throughput: 0: 6040.5. Samples: 34071560. Policy #0 lag: (min: 0.0, avg: 4.9, max: 9.0) [2024-08-05 16:14:33,119][15372] Avg episode reward: [(0, '37.037')] [2024-08-05 16:14:35,348][15444] Updated weights for policy 0, policy_version 16641 (0.0018) [2024-08-05 16:14:38,121][15372] Fps is (10 sec: 25388.3, 60 sec: 24166.2, 300 sec: 24270.4). Total num frames: 136388608. Throughput: 0: 6045.2. Samples: 34090030. Policy #0 lag: (min: 2.0, avg: 3.9, max: 8.0) [2024-08-05 16:14:38,122][15372] Avg episode reward: [(0, '37.307')] [2024-08-05 16:14:38,984][15444] Updated weights for policy 0, policy_version 16651 (0.0022) [2024-08-05 16:14:42,129][15444] Updated weights for policy 0, policy_version 16661 (0.0023) [2024-08-05 16:14:43,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 136503296. Throughput: 0: 6040.9. Samples: 34126590. 
Policy #0 lag: (min: 2.0, avg: 3.9, max: 8.0) [2024-08-05 16:14:43,126][15372] Avg episode reward: [(0, '36.503')] [2024-08-05 16:14:45,625][15444] Updated weights for policy 0, policy_version 16671 (0.0023) [2024-08-05 16:14:48,118][15372] Fps is (10 sec: 23763.4, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 136626176. Throughput: 0: 6035.3. Samples: 34162690. Policy #0 lag: (min: 2.0, avg: 3.9, max: 8.0) [2024-08-05 16:14:48,126][15372] Avg episode reward: [(0, '37.417')] [2024-08-05 16:14:49,373][15444] Updated weights for policy 0, policy_version 16681 (0.0039) [2024-08-05 16:14:52,371][15444] Updated weights for policy 0, policy_version 16691 (0.0019) [2024-08-05 16:14:53,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24166.2, 300 sec: 24242.7). Total num frames: 136749056. Throughput: 0: 6035.1. Samples: 34180540. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:14:53,127][15372] Avg episode reward: [(0, '36.964')] [2024-08-05 16:14:55,983][15444] Updated weights for policy 0, policy_version 16701 (0.0012) [2024-08-05 16:14:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24242.8). Total num frames: 136863744. Throughput: 0: 6019.1. Samples: 34216010. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:14:58,119][15372] Avg episode reward: [(0, '36.921')] [2024-08-05 16:14:59,281][15444] Updated weights for policy 0, policy_version 16711 (0.0039) [2024-08-05 16:15:02,691][15444] Updated weights for policy 0, policy_version 16721 (0.0013) [2024-08-05 16:15:03,121][15372] Fps is (10 sec: 23751.3, 60 sec: 24165.3, 300 sec: 24214.8). Total num frames: 136986624. Throughput: 0: 6002.3. Samples: 34252020. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 16:15:03,122][15372] Avg episode reward: [(0, '37.151')] [2024-08-05 16:15:03,231][15417] Signal inference workers to stop experience collection... (6050 times) [2024-08-05 16:15:03,232][15417] Signal inference workers to resume experience collection... 
(6050 times) [2024-08-05 16:15:03,304][15444] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-08-05 16:15:03,309][15444] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-08-05 16:15:06,310][15444] Updated weights for policy 0, policy_version 16731 (0.0015) [2024-08-05 16:15:08,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24029.7, 300 sec: 24215.5). Total num frames: 137109504. Throughput: 0: 6026.6. Samples: 34270580. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 16:15:08,119][15372] Avg episode reward: [(0, '36.574')] [2024-08-05 16:15:09,383][15444] Updated weights for policy 0, policy_version 16741 (0.0021) [2024-08-05 16:15:13,114][15444] Updated weights for policy 0, policy_version 16751 (0.0011) [2024-08-05 16:15:13,119][15372] Fps is (10 sec: 23763.3, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 137224192. Throughput: 0: 6037.3. Samples: 34306890. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 16:15:13,125][15372] Avg episode reward: [(0, '35.895')] [2024-08-05 16:15:16,103][15444] Updated weights for policy 0, policy_version 16761 (0.0036) [2024-08-05 16:15:18,125][15372] Fps is (10 sec: 23742.2, 60 sec: 24027.2, 300 sec: 24242.2). Total num frames: 137347072. Throughput: 0: 6030.2. Samples: 34342960. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 16:15:18,132][15372] Avg episode reward: [(0, '36.472')] [2024-08-05 16:15:19,838][15444] Updated weights for policy 0, policy_version 16771 (0.0011) [2024-08-05 16:15:23,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 137461760. Throughput: 0: 6015.9. Samples: 34360730. 
Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 16:15:23,126][15372] Avg episode reward: [(0, '36.435')] [2024-08-05 16:15:23,293][15444] Updated weights for policy 0, policy_version 16781 (0.0036) [2024-08-05 16:15:26,330][15444] Updated weights for policy 0, policy_version 16791 (0.0030) [2024-08-05 16:15:28,119][15372] Fps is (10 sec: 23772.1, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 137584640. Throughput: 0: 5995.3. Samples: 34396380. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:15:28,127][15372] Avg episode reward: [(0, '36.215')] [2024-08-05 16:15:30,105][15444] Updated weights for policy 0, policy_version 16801 (0.0039) [2024-08-05 16:15:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 137707520. Throughput: 0: 6007.1. Samples: 34433010. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:15:33,126][15372] Avg episode reward: [(0, '36.704')] [2024-08-05 16:15:33,332][15444] Updated weights for policy 0, policy_version 16811 (0.0010) [2024-08-05 16:15:36,634][15444] Updated weights for policy 0, policy_version 16821 (0.0014) [2024-08-05 16:15:38,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24031.0, 300 sec: 24215.0). Total num frames: 137830400. Throughput: 0: 6018.1. Samples: 34451350. Policy #0 lag: (min: 1.0, avg: 4.5, max: 9.0) [2024-08-05 16:15:38,126][15372] Avg episode reward: [(0, '36.588')] [2024-08-05 16:15:40,158][15444] Updated weights for policy 0, policy_version 16831 (0.0019) [2024-08-05 16:15:43,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 137953280. Throughput: 0: 6042.2. Samples: 34487910. Policy #0 lag: (min: 1.0, avg: 4.5, max: 9.0) [2024-08-05 16:15:43,126][15372] Avg episode reward: [(0, '36.076')] [2024-08-05 16:15:43,282][15444] Updated weights for policy 0, policy_version 16841 (0.0010) [2024-08-05 16:15:45,060][15417] Signal inference workers to stop experience collection... 
(6100 times) [2024-08-05 16:15:45,061][15417] Signal inference workers to resume experience collection... (6100 times) [2024-08-05 16:15:45,104][15444] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-08-05 16:15:45,104][15444] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-08-05 16:15:46,891][15444] Updated weights for policy 0, policy_version 16851 (0.0016) [2024-08-05 16:15:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 138076160. Throughput: 0: 6053.7. Samples: 34524420. Policy #0 lag: (min: 1.0, avg: 4.5, max: 9.0) [2024-08-05 16:15:48,119][15372] Avg episode reward: [(0, '36.113')] [2024-08-05 16:15:49,979][15444] Updated weights for policy 0, policy_version 16861 (0.0012) [2024-08-05 16:15:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24030.1, 300 sec: 24215.0). Total num frames: 138190848. Throughput: 0: 6072.5. Samples: 34543840. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:15:53,126][15372] Avg episode reward: [(0, '36.951')] [2024-08-05 16:15:53,468][15444] Updated weights for policy 0, policy_version 16871 (0.0016) [2024-08-05 16:15:56,827][15444] Updated weights for policy 0, policy_version 16881 (0.0011) [2024-08-05 16:15:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 138321920. Throughput: 0: 6058.2. Samples: 34579510. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:15:58,119][15372] Avg episode reward: [(0, '37.330')] [2024-08-05 16:16:00,094][15444] Updated weights for policy 0, policy_version 16891 (0.0017) [2024-08-05 16:16:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24167.5, 300 sec: 24215.0). Total num frames: 138436608. Throughput: 0: 6067.6. Samples: 34615960. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:16:03,126][15372] Avg episode reward: [(0, '36.825')] [2024-08-05 16:16:03,614][15444] Updated weights for policy 0, policy_version 16901 (0.0011) [2024-08-05 16:16:06,888][15444] Updated weights for policy 0, policy_version 16911 (0.0014) [2024-08-05 16:16:08,120][15372] Fps is (10 sec: 23753.3, 60 sec: 24166.0, 300 sec: 24242.6). Total num frames: 138559488. Throughput: 0: 6082.0. Samples: 34634430. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:16:08,120][15372] Avg episode reward: [(0, '36.393')] [2024-08-05 16:16:10,276][15444] Updated weights for policy 0, policy_version 16921 (0.0010) [2024-08-05 16:16:13,119][15372] Fps is (10 sec: 24574.1, 60 sec: 24302.6, 300 sec: 24214.9). Total num frames: 138682368. Throughput: 0: 6096.6. Samples: 34670730. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:16:13,127][15372] Avg episode reward: [(0, '36.763')] [2024-08-05 16:16:13,696][15444] Updated weights for policy 0, policy_version 16931 (0.0021) [2024-08-05 16:16:17,005][15444] Updated weights for policy 0, policy_version 16941 (0.0015) [2024-08-05 16:16:18,119][15372] Fps is (10 sec: 23760.0, 60 sec: 24169.0, 300 sec: 24215.0). Total num frames: 138797056. Throughput: 0: 6092.9. Samples: 34707190. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:16:18,119][15372] Avg episode reward: [(0, '37.562')] [2024-08-05 16:16:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000016944_138805248.pth... [2024-08-05 16:16:18,238][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000016235_132997120.pth [2024-08-05 16:16:20,487][15444] Updated weights for policy 0, policy_version 16951 (0.0012) [2024-08-05 16:16:23,118][15372] Fps is (10 sec: 24578.0, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 138928128. 
Throughput: 0: 6098.2. Samples: 34725770. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:16:23,119][15372] Avg episode reward: [(0, '37.116')] [2024-08-05 16:16:23,619][15444] Updated weights for policy 0, policy_version 16961 (0.0022) [2024-08-05 16:16:27,122][15444] Updated weights for policy 0, policy_version 16971 (0.0019) [2024-08-05 16:16:28,119][15372] Fps is (10 sec: 25394.6, 60 sec: 24439.4, 300 sec: 24242.8). Total num frames: 139051008. Throughput: 0: 6092.0. Samples: 34762050. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:16:28,119][15372] Avg episode reward: [(0, '37.206')] [2024-08-05 16:16:30,356][15444] Updated weights for policy 0, policy_version 16981 (0.0011) [2024-08-05 16:16:33,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24439.4, 300 sec: 24270.5). Total num frames: 139173888. Throughput: 0: 6102.4. Samples: 34799030. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:16:33,119][15372] Avg episode reward: [(0, '37.339')] [2024-08-05 16:16:33,858][15444] Updated weights for policy 0, policy_version 16991 (0.0015) [2024-08-05 16:16:37,206][15444] Updated weights for policy 0, policy_version 17001 (0.0031) [2024-08-05 16:16:38,119][15372] Fps is (10 sec: 23757.4, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 139288576. Throughput: 0: 6074.9. Samples: 34817210. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:16:38,119][15372] Avg episode reward: [(0, '37.048')] [2024-08-05 16:16:40,653][15444] Updated weights for policy 0, policy_version 17011 (0.0026) [2024-08-05 16:16:43,120][15372] Fps is (10 sec: 23754.4, 60 sec: 24302.5, 300 sec: 24242.7). Total num frames: 139411456. Throughput: 0: 6079.2. Samples: 34853080. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:16:43,127][15372] Avg episode reward: [(0, '36.130')] [2024-08-05 16:16:43,885][15417] Signal inference workers to stop experience collection... 
(6150 times) [2024-08-05 16:16:43,893][15417] Signal inference workers to resume experience collection... (6150 times) [2024-08-05 16:16:43,951][15444] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-08-05 16:16:43,951][15444] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-08-05 16:16:43,971][15444] Updated weights for policy 0, policy_version 17021 (0.0018) [2024-08-05 16:16:47,416][15444] Updated weights for policy 0, policy_version 17031 (0.0017) [2024-08-05 16:16:48,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 139534336. Throughput: 0: 6074.0. Samples: 34889290. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:16:48,126][15372] Avg episode reward: [(0, '35.988')] [2024-08-05 16:16:50,859][15444] Updated weights for policy 0, policy_version 17041 (0.0022) [2024-08-05 16:16:53,118][15372] Fps is (10 sec: 24578.7, 60 sec: 24439.4, 300 sec: 24242.8). Total num frames: 139657216. Throughput: 0: 6077.1. Samples: 34907890. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 16:16:53,126][15372] Avg episode reward: [(0, '36.234')] [2024-08-05 16:16:54,190][15444] Updated weights for policy 0, policy_version 17051 (0.0031) [2024-08-05 16:16:57,597][15444] Updated weights for policy 0, policy_version 17061 (0.0013) [2024-08-05 16:16:58,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24215.8). Total num frames: 139771904. Throughput: 0: 6072.8. Samples: 34944000. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 16:16:58,119][15372] Avg episode reward: [(0, '36.661')] [2024-08-05 16:17:00,894][15444] Updated weights for policy 0, policy_version 17071 (0.0033) [2024-08-05 16:17:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 139894784. Throughput: 0: 6054.9. Samples: 34979660. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 16:17:03,126][15372] Avg episode reward: [(0, '36.570')] [2024-08-05 16:17:04,465][15444] Updated weights for policy 0, policy_version 17081 (0.0024) [2024-08-05 16:17:08,062][15444] Updated weights for policy 0, policy_version 17091 (0.0015) [2024-08-05 16:17:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24167.0, 300 sec: 24187.3). Total num frames: 140009472. Throughput: 0: 6037.3. Samples: 34997450. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 16:17:08,119][15372] Avg episode reward: [(0, '36.894')] [2024-08-05 16:17:11,083][15444] Updated weights for policy 0, policy_version 17101 (0.0030) [2024-08-05 16:17:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.7, 300 sec: 24187.2). Total num frames: 140132352. Throughput: 0: 6033.6. Samples: 35033560. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 16:17:13,119][15372] Avg episode reward: [(0, '37.378')] [2024-08-05 16:17:14,755][15444] Updated weights for policy 0, policy_version 17111 (0.0013) [2024-08-05 16:17:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 140247040. Throughput: 0: 5996.5. Samples: 35068870. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:17:18,126][15372] Avg episode reward: [(0, '37.127')] [2024-08-05 16:17:18,310][15444] Updated weights for policy 0, policy_version 17121 (0.0031) [2024-08-05 16:17:21,435][15444] Updated weights for policy 0, policy_version 17131 (0.0016) [2024-08-05 16:17:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 140369920. Throughput: 0: 6006.9. Samples: 35087520. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:17:23,126][15372] Avg episode reward: [(0, '35.759')] [2024-08-05 16:17:25,045][15444] Updated weights for policy 0, policy_version 17141 (0.0017) [2024-08-05 16:17:28,111][15444] Updated weights for policy 0, policy_version 17151 (0.0017) [2024-08-05 16:17:28,119][15372] Fps is (10 sec: 25394.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 140500992. Throughput: 0: 6029.9. Samples: 35124420. Policy #0 lag: (min: 1.0, avg: 4.8, max: 8.0) [2024-08-05 16:17:28,124][15372] Avg episode reward: [(0, '35.944')] [2024-08-05 16:17:29,459][15417] Signal inference workers to stop experience collection... (6200 times) [2024-08-05 16:17:29,460][15417] Signal inference workers to resume experience collection... (6200 times) [2024-08-05 16:17:29,529][15444] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-08-05 16:17:29,536][15444] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-08-05 16:17:31,604][15444] Updated weights for policy 0, policy_version 17161 (0.0039) [2024-08-05 16:17:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 140615680. Throughput: 0: 6026.9. Samples: 35160500. Policy #0 lag: (min: 1.0, avg: 4.8, max: 8.0) [2024-08-05 16:17:33,119][15372] Avg episode reward: [(0, '37.053')] [2024-08-05 16:17:35,006][15444] Updated weights for policy 0, policy_version 17171 (0.0015) [2024-08-05 16:17:38,030][15444] Updated weights for policy 0, policy_version 17181 (0.0022) [2024-08-05 16:17:38,119][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 140746752. Throughput: 0: 6030.2. Samples: 35179250. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:17:38,119][15372] Avg episode reward: [(0, '36.514')] [2024-08-05 16:17:41,683][15444] Updated weights for policy 0, policy_version 17191 (0.0034) [2024-08-05 16:17:43,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.8, 300 sec: 24187.2). Total num frames: 140861440. Throughput: 0: 6035.1. Samples: 35215580. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:17:43,119][15372] Avg episode reward: [(0, '36.284')] [2024-08-05 16:17:44,814][15444] Updated weights for policy 0, policy_version 17201 (0.0013) [2024-08-05 16:17:48,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 140984320. Throughput: 0: 6048.6. Samples: 35251850. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:17:48,127][15372] Avg episode reward: [(0, '36.326')] [2024-08-05 16:17:48,326][15444] Updated weights for policy 0, policy_version 17211 (0.0021) [2024-08-05 16:17:51,885][15444] Updated weights for policy 0, policy_version 17221 (0.0016) [2024-08-05 16:17:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 141107200. Throughput: 0: 6057.8. Samples: 35270050. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 16:17:53,119][15372] Avg episode reward: [(0, '35.712')] [2024-08-05 16:17:55,026][15444] Updated weights for policy 0, policy_version 17231 (0.0020) [2024-08-05 16:17:58,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 141221888. Throughput: 0: 6060.1. Samples: 35306270. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 16:17:58,127][15372] Avg episode reward: [(0, '36.756')] [2024-08-05 16:17:58,598][15444] Updated weights for policy 0, policy_version 17241 (0.0028) [2024-08-05 16:18:02,220][15444] Updated weights for policy 0, policy_version 17251 (0.0020) [2024-08-05 16:18:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.3). 
Total num frames: 141344768. Throughput: 0: 6079.3. Samples: 35342440. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:18:03,119][15372] Avg episode reward: [(0, '37.505')] [2024-08-05 16:18:05,271][15444] Updated weights for policy 0, policy_version 17261 (0.0012) [2024-08-05 16:18:08,118][15372] Fps is (10 sec: 24577.5, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 141467648. Throughput: 0: 6070.2. Samples: 35360680. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:18:08,126][15372] Avg episode reward: [(0, '37.712')] [2024-08-05 16:18:08,834][15444] Updated weights for policy 0, policy_version 17271 (0.0013) [2024-08-05 16:18:11,978][15444] Updated weights for policy 0, policy_version 17281 (0.0014) [2024-08-05 16:18:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 141590528. Throughput: 0: 6054.5. Samples: 35396870. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:18:13,119][15372] Avg episode reward: [(0, '37.626')] [2024-08-05 16:18:15,395][15444] Updated weights for policy 0, policy_version 17291 (0.0024) [2024-08-05 16:18:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 141705216. Throughput: 0: 6076.4. Samples: 35433940. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:18:18,119][15372] Avg episode reward: [(0, '37.787')] [2024-08-05 16:18:18,185][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000017299_141713408.pth... 
[2024-08-05 16:18:18,351][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000016590_135905280.pth [2024-08-05 16:18:18,793][15444] Updated weights for policy 0, policy_version 17301 (0.0029) [2024-08-05 16:18:22,443][15444] Updated weights for policy 0, policy_version 17311 (0.0026) [2024-08-05 16:18:22,791][15417] Signal inference workers to stop experience collection... (6250 times) [2024-08-05 16:18:22,791][15417] Signal inference workers to resume experience collection... (6250 times) [2024-08-05 16:18:22,853][15444] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-08-05 16:18:22,853][15444] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-08-05 16:18:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 141828096. Throughput: 0: 6057.8. Samples: 35451850. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:18:23,119][15372] Avg episode reward: [(0, '36.480')] [2024-08-05 16:18:25,563][15444] Updated weights for policy 0, policy_version 17321 (0.0011) [2024-08-05 16:18:28,119][15372] Fps is (10 sec: 24574.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 141950976. Throughput: 0: 6044.6. Samples: 35487590. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:18:28,127][15372] Avg episode reward: [(0, '37.207')] [2024-08-05 16:18:29,225][15444] Updated weights for policy 0, policy_version 17331 (0.0027) [2024-08-05 16:18:32,694][15444] Updated weights for policy 0, policy_version 17341 (0.0014) [2024-08-05 16:18:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.6). Total num frames: 142065664. Throughput: 0: 6017.4. Samples: 35522630. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:18:33,119][15372] Avg episode reward: [(0, '36.652')] [2024-08-05 16:18:35,799][15444] Updated weights for policy 0, policy_version 17351 (0.0014) [2024-08-05 16:18:38,119][15372] Fps is (10 sec: 23758.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 142188544. Throughput: 0: 6015.3. Samples: 35540740. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:18:38,128][15372] Avg episode reward: [(0, '36.687')] [2024-08-05 16:18:39,593][15444] Updated weights for policy 0, policy_version 17361 (0.0013) [2024-08-05 16:18:43,039][15444] Updated weights for policy 0, policy_version 17371 (0.0030) [2024-08-05 16:18:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 142303232. Throughput: 0: 6012.1. Samples: 35576810. Policy #0 lag: (min: 0.0, avg: 3.6, max: 10.0) [2024-08-05 16:18:43,119][15372] Avg episode reward: [(0, '37.025')] [2024-08-05 16:18:46,192][15444] Updated weights for policy 0, policy_version 17381 (0.0013) [2024-08-05 16:18:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 142426112. Throughput: 0: 6004.0. Samples: 35612620. Policy #0 lag: (min: 0.0, avg: 3.6, max: 10.0) [2024-08-05 16:18:48,126][15372] Avg episode reward: [(0, '37.152')] [2024-08-05 16:18:49,806][15444] Updated weights for policy 0, policy_version 17391 (0.0022) [2024-08-05 16:18:52,879][15444] Updated weights for policy 0, policy_version 17401 (0.0014) [2024-08-05 16:18:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24187.3). Total num frames: 142548992. Throughput: 0: 6003.3. Samples: 35630830. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:18:53,119][15372] Avg episode reward: [(0, '36.415')] [2024-08-05 16:18:56,388][15444] Updated weights for policy 0, policy_version 17411 (0.0017) [2024-08-05 16:18:58,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24030.1, 300 sec: 24159.5). 
Total num frames: 142663680. Throughput: 0: 5998.0. Samples: 35666780. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:18:58,119][15372] Avg episode reward: [(0, '35.658')] [2024-08-05 16:18:59,980][15444] Updated weights for policy 0, policy_version 17421 (0.0021) [2024-08-05 16:19:01,299][15417] Signal inference workers to stop experience collection... (6300 times) [2024-08-05 16:19:01,299][15417] Signal inference workers to resume experience collection... (6300 times) [2024-08-05 16:19:01,348][15444] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-08-05 16:19:01,348][15444] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-08-05 16:19:03,101][15444] Updated weights for policy 0, policy_version 17431 (0.0019) [2024-08-05 16:19:03,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 142794752. Throughput: 0: 6006.9. Samples: 35704250. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:19:03,119][15372] Avg episode reward: [(0, '36.920')] [2024-08-05 16:19:06,439][15444] Updated weights for policy 0, policy_version 17441 (0.0013) [2024-08-05 16:19:08,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 142909440. Throughput: 0: 6016.2. Samples: 35722580. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:19:08,119][15372] Avg episode reward: [(0, '37.851')] [2024-08-05 16:19:09,693][15444] Updated weights for policy 0, policy_version 17451 (0.0019) [2024-08-05 16:19:13,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 143032320. Throughput: 0: 6043.6. Samples: 35759550. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:19:13,126][15372] Avg episode reward: [(0, '37.423')] [2024-08-05 16:19:13,406][15444] Updated weights for policy 0, policy_version 17461 (0.0017) [2024-08-05 16:19:16,522][15444] Updated weights for policy 0, policy_version 17471 (0.0011) [2024-08-05 16:19:18,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 143155200. Throughput: 0: 6068.9. Samples: 35795730. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:19:18,126][15372] Avg episode reward: [(0, '36.395')] [2024-08-05 16:19:19,998][15444] Updated weights for policy 0, policy_version 17481 (0.0030) [2024-08-05 16:19:23,096][15444] Updated weights for policy 0, policy_version 17491 (0.0013) [2024-08-05 16:19:23,119][15372] Fps is (10 sec: 25395.5, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 143286272. Throughput: 0: 6070.4. Samples: 35813910. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:19:23,119][15372] Avg episode reward: [(0, '36.229')] [2024-08-05 16:19:26,576][15444] Updated weights for policy 0, policy_version 17501 (0.0018) [2024-08-05 16:19:28,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24030.1, 300 sec: 24159.5). Total num frames: 143392768. Throughput: 0: 6071.1. Samples: 35850010. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:19:28,126][15372] Avg episode reward: [(0, '35.595')] [2024-08-05 16:19:30,205][15444] Updated weights for policy 0, policy_version 17511 (0.0012) [2024-08-05 16:19:33,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24302.9, 300 sec: 24187.5). Total num frames: 143523840. Throughput: 0: 6102.0. Samples: 35887210. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 16:19:33,126][15372] Avg episode reward: [(0, '35.434')] [2024-08-05 16:19:33,366][15444] Updated weights for policy 0, policy_version 17521 (0.0021) [2024-08-05 16:19:36,926][15444] Updated weights for policy 0, policy_version 17531 (0.0013) [2024-08-05 16:19:37,084][15417] Signal inference workers to stop experience collection... (6350 times) [2024-08-05 16:19:37,084][15417] Signal inference workers to resume experience collection... (6350 times) [2024-08-05 16:19:37,124][15444] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-08-05 16:19:37,124][15444] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-08-05 16:19:38,119][15372] Fps is (10 sec: 25394.3, 60 sec: 24302.8, 300 sec: 24215.0). Total num frames: 143646720. Throughput: 0: 6093.1. Samples: 35905020. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 16:19:38,119][15372] Avg episode reward: [(0, '36.472')] [2024-08-05 16:19:40,091][15444] Updated weights for policy 0, policy_version 17541 (0.0011) [2024-08-05 16:19:43,119][15372] Fps is (10 sec: 22937.4, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 143753216. Throughput: 0: 6094.4. Samples: 35941030. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:19:43,126][15372] Avg episode reward: [(0, '37.336')] [2024-08-05 16:19:43,583][15444] Updated weights for policy 0, policy_version 17551 (0.0013) [2024-08-05 16:19:47,244][15444] Updated weights for policy 0, policy_version 17561 (0.0010) [2024-08-05 16:19:48,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 143884288. Throughput: 0: 6069.0. Samples: 35977360. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:19:48,119][15372] Avg episode reward: [(0, '37.215')] [2024-08-05 16:19:50,164][15444] Updated weights for policy 0, policy_version 17571 (0.0034) [2024-08-05 16:19:53,118][15372] Fps is (10 sec: 25395.5, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 144007168. Throughput: 0: 6060.7. Samples: 35995310. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:19:53,126][15372] Avg episode reward: [(0, '37.829')] [2024-08-05 16:19:53,968][15444] Updated weights for policy 0, policy_version 17581 (0.0019) [2024-08-05 16:19:57,007][15444] Updated weights for policy 0, policy_version 17591 (0.0024) [2024-08-05 16:19:58,119][15372] Fps is (10 sec: 23757.5, 60 sec: 24302.9, 300 sec: 24187.5). Total num frames: 144121856. Throughput: 0: 6049.1. Samples: 36031760. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:19:58,126][15372] Avg episode reward: [(0, '37.042')] [2024-08-05 16:20:00,442][15444] Updated weights for policy 0, policy_version 17601 (0.0018) [2024-08-05 16:20:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 144252928. Throughput: 0: 6054.5. Samples: 36068180. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:20:03,119][15372] Avg episode reward: [(0, '37.394')] [2024-08-05 16:20:04,197][15444] Updated weights for policy 0, policy_version 17611 (0.0018) [2024-08-05 16:20:07,095][15444] Updated weights for policy 0, policy_version 17621 (0.0018) [2024-08-05 16:20:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 144359424. Throughput: 0: 6052.5. Samples: 36086270. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:20:08,126][15372] Avg episode reward: [(0, '37.560')] [2024-08-05 16:20:10,602][15417] Signal inference workers to stop experience collection... (6400 times) [2024-08-05 16:20:10,607][15417] Signal inference workers to resume experience collection... 
(6400 times) [2024-08-05 16:20:10,665][15444] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-08-05 16:20:10,665][15444] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-08-05 16:20:10,707][15444] Updated weights for policy 0, policy_version 17631 (0.0025) [2024-08-05 16:20:13,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24303.0, 300 sec: 24215.5). Total num frames: 144490496. Throughput: 0: 6057.3. Samples: 36122590. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:20:13,119][15372] Avg episode reward: [(0, '37.675')] [2024-08-05 16:20:13,774][15444] Updated weights for policy 0, policy_version 17641 (0.0024) [2024-08-05 16:20:17,354][15444] Updated weights for policy 0, policy_version 17651 (0.0012) [2024-08-05 16:20:18,119][15372] Fps is (10 sec: 25394.6, 60 sec: 24302.9, 300 sec: 24242.7). Total num frames: 144613376. Throughput: 0: 6040.2. Samples: 36159020. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:20:18,119][15372] Avg episode reward: [(0, '37.552')] [2024-08-05 16:20:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000017653_144613376.pth... [2024-08-05 16:20:18,267][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000016944_138805248.pth [2024-08-05 16:20:21,030][15444] Updated weights for policy 0, policy_version 17661 (0.0037) [2024-08-05 16:20:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 144728064. Throughput: 0: 6055.4. Samples: 36177510. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 16:20:23,119][15372] Avg episode reward: [(0, '35.805')] [2024-08-05 16:20:24,073][15444] Updated weights for policy 0, policy_version 17671 (0.0016) [2024-08-05 16:20:27,639][15444] Updated weights for policy 0, policy_version 17681 (0.0022) [2024-08-05 16:20:28,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 144850944. Throughput: 0: 6042.9. Samples: 36212960. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 16:20:28,119][15372] Avg episode reward: [(0, '36.134')] [2024-08-05 16:20:31,154][15444] Updated weights for policy 0, policy_version 17691 (0.0013) [2024-08-05 16:20:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 144973824. Throughput: 0: 6032.5. Samples: 36248820. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:20:33,126][15372] Avg episode reward: [(0, '37.086')] [2024-08-05 16:20:34,391][15444] Updated weights for policy 0, policy_version 17701 (0.0017) [2024-08-05 16:20:37,795][15444] Updated weights for policy 0, policy_version 17711 (0.0026) [2024-08-05 16:20:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 145096704. Throughput: 0: 6054.0. Samples: 36267740. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:20:38,119][15372] Avg episode reward: [(0, '37.547')] [2024-08-05 16:20:41,028][15444] Updated weights for policy 0, policy_version 17721 (0.0011) [2024-08-05 16:20:43,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 145211392. Throughput: 0: 6049.3. Samples: 36303980. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:20:43,119][15372] Avg episode reward: [(0, '37.094')] [2024-08-05 16:20:44,436][15444] Updated weights for policy 0, policy_version 17731 (0.0043) [2024-08-05 16:20:47,911][15444] Updated weights for policy 0, policy_version 17741 (0.0013) [2024-08-05 16:20:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 145334272. Throughput: 0: 6042.9. Samples: 36340110. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:20:48,119][15372] Avg episode reward: [(0, '36.045')] [2024-08-05 16:20:51,228][15444] Updated weights for policy 0, policy_version 17751 (0.0018) [2024-08-05 16:20:53,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 145457152. Throughput: 0: 6061.1. Samples: 36359020. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:20:53,126][15372] Avg episode reward: [(0, '36.585')] [2024-08-05 16:20:54,468][15444] Updated weights for policy 0, policy_version 17761 (0.0012) [2024-08-05 16:20:57,944][15444] Updated weights for policy 0, policy_version 17771 (0.0013) [2024-08-05 16:20:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 145580032. Throughput: 0: 6067.5. Samples: 36395630. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 16:20:58,119][15372] Avg episode reward: [(0, '35.764')] [2024-08-05 16:21:01,529][15444] Updated weights for policy 0, policy_version 17781 (0.0043) [2024-08-05 16:21:03,119][15372] Fps is (10 sec: 24574.4, 60 sec: 24166.2, 300 sec: 24215.1). Total num frames: 145702912. Throughput: 0: 6054.4. Samples: 36431470. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 16:21:03,119][15372] Avg episode reward: [(0, '37.328')] [2024-08-05 16:21:04,650][15444] Updated weights for policy 0, policy_version 17791 (0.0017) [2024-08-05 16:21:05,119][15417] Signal inference workers to stop experience collection... 
(6450 times) [2024-08-05 16:21:05,125][15417] Signal inference workers to resume experience collection... (6450 times) [2024-08-05 16:21:05,194][15444] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-08-05 16:21:05,201][15444] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-08-05 16:21:08,021][15444] Updated weights for policy 0, policy_version 17801 (0.0013) [2024-08-05 16:21:08,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24439.3, 300 sec: 24215.0). Total num frames: 145825792. Throughput: 0: 6065.3. Samples: 36450450. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 16:21:08,119][15372] Avg episode reward: [(0, '36.643')] [2024-08-05 16:21:11,450][15444] Updated weights for policy 0, policy_version 17811 (0.0020) [2024-08-05 16:21:13,120][15372] Fps is (10 sec: 24573.4, 60 sec: 24302.2, 300 sec: 24242.6). Total num frames: 145948672. Throughput: 0: 6077.1. Samples: 36486440. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 16:21:13,128][15372] Avg episode reward: [(0, '36.340')] [2024-08-05 16:21:14,841][15444] Updated weights for policy 0, policy_version 17821 (0.0026) [2024-08-05 16:21:18,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 146063360. Throughput: 0: 6103.8. Samples: 36523490. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 16:21:18,126][15372] Avg episode reward: [(0, '36.645')] [2024-08-05 16:21:18,133][15444] Updated weights for policy 0, policy_version 17831 (0.0019) [2024-08-05 16:21:21,383][15444] Updated weights for policy 0, policy_version 17841 (0.0021) [2024-08-05 16:21:23,118][15372] Fps is (10 sec: 23760.8, 60 sec: 24303.0, 300 sec: 24187.3). Total num frames: 146186240. Throughput: 0: 6096.2. Samples: 36542070. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:21:23,126][15372] Avg episode reward: [(0, '37.014')] [2024-08-05 16:21:24,722][15444] Updated weights for policy 0, policy_version 17851 (0.0011) [2024-08-05 16:21:28,026][15444] Updated weights for policy 0, policy_version 17861 (0.0013) [2024-08-05 16:21:28,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 146317312. Throughput: 0: 6107.6. Samples: 36578820. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:21:28,119][15372] Avg episode reward: [(0, '36.944')] [2024-08-05 16:21:31,708][15444] Updated weights for policy 0, policy_version 17871 (0.0024) [2024-08-05 16:21:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 146432000. Throughput: 0: 6097.6. Samples: 36614500. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:21:33,119][15372] Avg episode reward: [(0, '37.111')] [2024-08-05 16:21:34,858][15444] Updated weights for policy 0, policy_version 17881 (0.0023) [2024-08-05 16:21:38,119][15372] Fps is (10 sec: 22937.2, 60 sec: 24166.3, 300 sec: 24187.3). Total num frames: 146546688. Throughput: 0: 6091.5. Samples: 36633140. Policy #0 lag: (min: 1.0, avg: 3.1, max: 7.0) [2024-08-05 16:21:38,126][15372] Avg episode reward: [(0, '37.611')] [2024-08-05 16:21:38,362][15444] Updated weights for policy 0, policy_version 17891 (0.0028) [2024-08-05 16:21:41,838][15444] Updated weights for policy 0, policy_version 17901 (0.0018) [2024-08-05 16:21:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 146677760. Throughput: 0: 6067.8. Samples: 36668680. Policy #0 lag: (min: 1.0, avg: 3.1, max: 7.0) [2024-08-05 16:21:43,119][15372] Avg episode reward: [(0, '38.135')] [2024-08-05 16:21:43,119][15417] Saving new best policy, reward=38.135! 
[2024-08-05 16:21:45,137][15444] Updated weights for policy 0, policy_version 17911 (0.0019) [2024-08-05 16:21:48,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 146792448. Throughput: 0: 6067.2. Samples: 36704490. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 16:21:48,126][15372] Avg episode reward: [(0, '37.213')] [2024-08-05 16:21:48,785][15444] Updated weights for policy 0, policy_version 17921 (0.0011) [2024-08-05 16:21:49,367][15417] Signal inference workers to stop experience collection... (6500 times) [2024-08-05 16:21:49,367][15417] Signal inference workers to resume experience collection... (6500 times) [2024-08-05 16:21:49,436][15444] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-08-05 16:21:49,436][15444] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-08-05 16:21:51,716][15444] Updated weights for policy 0, policy_version 17931 (0.0012) [2024-08-05 16:21:53,121][15372] Fps is (10 sec: 23751.1, 60 sec: 24302.0, 300 sec: 24214.8). Total num frames: 146915328. Throughput: 0: 6066.2. Samples: 36723440. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 16:21:53,121][15372] Avg episode reward: [(0, '36.734')] [2024-08-05 16:21:55,225][15444] Updated weights for policy 0, policy_version 17941 (0.0015) [2024-08-05 16:21:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 147038208. Throughput: 0: 6073.1. Samples: 36759720. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 16:21:58,126][15372] Avg episode reward: [(0, '37.496')] [2024-08-05 16:21:58,859][15444] Updated weights for policy 0, policy_version 17951 (0.0020) [2024-08-05 16:22:02,001][15444] Updated weights for policy 0, policy_version 17961 (0.0013) [2024-08-05 16:22:03,119][15372] Fps is (10 sec: 23761.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 147152896. Throughput: 0: 6037.9. Samples: 36795200. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:22:03,119][15372] Avg episode reward: [(0, '36.901')] [2024-08-05 16:22:05,592][15444] Updated weights for policy 0, policy_version 17971 (0.0015) [2024-08-05 16:22:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 147275776. Throughput: 0: 6034.2. Samples: 36813610. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:22:08,119][15372] Avg episode reward: [(0, '36.659')] [2024-08-05 16:22:08,936][15444] Updated weights for policy 0, policy_version 17981 (0.0034) [2024-08-05 16:22:12,271][15444] Updated weights for policy 0, policy_version 17991 (0.0013) [2024-08-05 16:22:13,119][15372] Fps is (10 sec: 24577.0, 60 sec: 24167.0, 300 sec: 24242.8). Total num frames: 147398656. Throughput: 0: 6026.6. Samples: 36850020. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:22:13,126][15372] Avg episode reward: [(0, '37.419')] [2024-08-05 16:22:15,713][15444] Updated weights for policy 0, policy_version 18001 (0.0040) [2024-08-05 16:22:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 147513344. Throughput: 0: 6046.4. Samples: 36886590. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:22:18,119][15372] Avg episode reward: [(0, '37.179')] [2024-08-05 16:22:18,213][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000018008_147521536.pth... [2024-08-05 16:22:18,366][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000017299_141713408.pth [2024-08-05 16:22:19,034][15444] Updated weights for policy 0, policy_version 18011 (0.0013) [2024-08-05 16:22:22,557][15444] Updated weights for policy 0, policy_version 18021 (0.0015) [2024-08-05 16:22:23,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 147636224. 
Throughput: 0: 6025.8. Samples: 36904300. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:22:23,119][15372] Avg episode reward: [(0, '36.423')] [2024-08-05 16:22:25,591][15444] Updated weights for policy 0, policy_version 18031 (0.0018) [2024-08-05 16:22:28,120][15372] Fps is (10 sec: 24571.4, 60 sec: 24029.1, 300 sec: 24214.8). Total num frames: 147759104. Throughput: 0: 6036.0. Samples: 36940310. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:22:28,128][15372] Avg episode reward: [(0, '36.088')] [2024-08-05 16:22:29,371][15444] Updated weights for policy 0, policy_version 18041 (0.0024) [2024-08-05 16:22:29,754][15417] Signal inference workers to stop experience collection... (6550 times) [2024-08-05 16:22:29,755][15417] Signal inference workers to resume experience collection... (6550 times) [2024-08-05 16:22:29,825][15444] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-08-05 16:22:29,833][15444] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-08-05 16:22:32,869][15444] Updated weights for policy 0, policy_version 18051 (0.0038) [2024-08-05 16:22:33,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 147873792. Throughput: 0: 6049.1. Samples: 36976700. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:22:33,119][15372] Avg episode reward: [(0, '36.817')] [2024-08-05 16:22:36,036][15444] Updated weights for policy 0, policy_version 18061 (0.0013) [2024-08-05 16:22:38,133][15372] Fps is (10 sec: 24543.8, 60 sec: 24296.9, 300 sec: 24213.8). Total num frames: 148004864. Throughput: 0: 6015.6. Samples: 36994220. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:22:38,134][15372] Avg episode reward: [(0, '37.003')] [2024-08-05 16:22:39,706][15444] Updated weights for policy 0, policy_version 18071 (0.0013) [2024-08-05 16:22:42,913][15444] Updated weights for policy 0, policy_version 18081 (0.0012) [2024-08-05 16:22:43,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 148119552. Throughput: 0: 6018.9. Samples: 37030570. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:22:43,119][15372] Avg episode reward: [(0, '36.854')] [2024-08-05 16:22:46,375][15444] Updated weights for policy 0, policy_version 18091 (0.0022) [2024-08-05 16:22:48,118][15372] Fps is (10 sec: 22972.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 148234240. Throughput: 0: 6028.1. Samples: 37066460. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:22:48,126][15372] Avg episode reward: [(0, '36.207')] [2024-08-05 16:22:49,686][15444] Updated weights for policy 0, policy_version 18101 (0.0013) [2024-08-05 16:22:52,971][15444] Updated weights for policy 0, policy_version 18111 (0.0019) [2024-08-05 16:22:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24167.4, 300 sec: 24215.1). Total num frames: 148365312. Throughput: 0: 6034.9. Samples: 37085180. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 16:22:53,119][15372] Avg episode reward: [(0, '36.497')] [2024-08-05 16:22:56,642][15444] Updated weights for policy 0, policy_version 18121 (0.0010) [2024-08-05 16:22:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 148480000. Throughput: 0: 6024.0. Samples: 37121100. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 16:22:58,119][15372] Avg episode reward: [(0, '36.646')] [2024-08-05 16:22:59,637][15444] Updated weights for policy 0, policy_version 18131 (0.0017) [2024-08-05 16:23:03,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.6, 300 sec: 24187.2). 
Total num frames: 148602880. Throughput: 0: 6022.7. Samples: 37157610. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 16:23:03,126][15372] Avg episode reward: [(0, '36.552')] [2024-08-05 16:23:03,294][15444] Updated weights for policy 0, policy_version 18141 (0.0013) [2024-08-05 16:23:06,610][15444] Updated weights for policy 0, policy_version 18151 (0.0011) [2024-08-05 16:23:07,841][15417] Signal inference workers to stop experience collection... (6600 times) [2024-08-05 16:23:07,852][15417] Signal inference workers to resume experience collection... (6600 times) [2024-08-05 16:23:07,894][15444] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-08-05 16:23:07,894][15444] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-08-05 16:23:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 148725760. Throughput: 0: 6026.2. Samples: 37175480. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 16:23:08,119][15372] Avg episode reward: [(0, '36.935')] [2024-08-05 16:23:09,938][15444] Updated weights for policy 0, policy_version 18161 (0.0013) [2024-08-05 16:23:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 148848640. Throughput: 0: 6042.7. Samples: 37212220. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 16:23:13,126][15372] Avg episode reward: [(0, '37.257')] [2024-08-05 16:23:13,625][15444] Updated weights for policy 0, policy_version 18171 (0.0011) [2024-08-05 16:23:16,871][15444] Updated weights for policy 0, policy_version 18181 (0.0012) [2024-08-05 16:23:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 148963328. Throughput: 0: 6029.1. Samples: 37248010. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:23:18,119][15372] Avg episode reward: [(0, '36.400')] [2024-08-05 16:23:20,103][15444] Updated weights for policy 0, policy_version 18191 (0.0013) [2024-08-05 16:23:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 149094400. Throughput: 0: 6063.1. Samples: 37266970. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:23:23,126][15372] Avg episode reward: [(0, '36.791')] [2024-08-05 16:23:23,395][15444] Updated weights for policy 0, policy_version 18201 (0.0018) [2024-08-05 16:23:26,966][15444] Updated weights for policy 0, policy_version 18211 (0.0040) [2024-08-05 16:23:28,124][15372] Fps is (10 sec: 24562.4, 60 sec: 24164.9, 300 sec: 24214.5). Total num frames: 149209088. Throughput: 0: 6053.9. Samples: 37303030. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:23:28,124][15372] Avg episode reward: [(0, '37.568')] [2024-08-05 16:23:30,209][15444] Updated weights for policy 0, policy_version 18221 (0.0029) [2024-08-05 16:23:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 149331968. Throughput: 0: 6054.2. Samples: 37338900. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:23:33,126][15372] Avg episode reward: [(0, '37.205')] [2024-08-05 16:23:33,791][15444] Updated weights for policy 0, policy_version 18231 (0.0013) [2024-08-05 16:23:37,079][15444] Updated weights for policy 0, policy_version 18241 (0.0010) [2024-08-05 16:23:38,118][15372] Fps is (10 sec: 24589.6, 60 sec: 24172.4, 300 sec: 24242.8). Total num frames: 149454848. Throughput: 0: 6051.3. Samples: 37357490. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:23:38,119][15372] Avg episode reward: [(0, '37.814')] [2024-08-05 16:23:40,238][15444] Updated weights for policy 0, policy_version 18251 (0.0025) [2024-08-05 16:23:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). 
Total num frames: 149569536. Throughput: 0: 6047.3. Samples: 37393230. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:23:43,126][15372] Avg episode reward: [(0, '37.297')] [2024-08-05 16:23:44,060][15444] Updated weights for policy 0, policy_version 18261 (0.0029) [2024-08-05 16:23:47,271][15444] Updated weights for policy 0, policy_version 18271 (0.0018) [2024-08-05 16:23:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 149692416. Throughput: 0: 6028.9. Samples: 37428910. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:23:48,126][15372] Avg episode reward: [(0, '38.086')] [2024-08-05 16:23:50,635][15444] Updated weights for policy 0, policy_version 18281 (0.0021) [2024-08-05 16:23:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 149815296. Throughput: 0: 6033.8. Samples: 37447000. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:23:53,126][15372] Avg episode reward: [(0, '39.465')] [2024-08-05 16:23:53,127][15417] Saving new best policy, reward=39.465! [2024-08-05 16:23:54,217][15444] Updated weights for policy 0, policy_version 18291 (0.0017) [2024-08-05 16:23:57,633][15444] Updated weights for policy 0, policy_version 18301 (0.0011) [2024-08-05 16:23:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 149929984. Throughput: 0: 6024.2. Samples: 37483310. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:23:58,119][15372] Avg episode reward: [(0, '37.174')] [2024-08-05 16:23:58,215][15417] Signal inference workers to stop experience collection... (6650 times) [2024-08-05 16:23:58,216][15417] Signal inference workers to resume experience collection... 
(6650 times) [2024-08-05 16:23:58,268][15444] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-08-05 16:23:58,274][15444] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-08-05 16:24:00,887][15444] Updated weights for policy 0, policy_version 18311 (0.0022) [2024-08-05 16:24:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 150052864. Throughput: 0: 6023.3. Samples: 37519060. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:24:03,126][15372] Avg episode reward: [(0, '36.953')] [2024-08-05 16:24:04,498][15444] Updated weights for policy 0, policy_version 18321 (0.0024) [2024-08-05 16:24:07,645][15444] Updated weights for policy 0, policy_version 18331 (0.0035) [2024-08-05 16:24:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 150175744. Throughput: 0: 6002.2. Samples: 37537070. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:24:08,119][15372] Avg episode reward: [(0, '36.751')] [2024-08-05 16:24:11,266][15444] Updated weights for policy 0, policy_version 18341 (0.0010) [2024-08-05 16:24:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 150290432. Throughput: 0: 6012.7. Samples: 37573570. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:24:13,119][15372] Avg episode reward: [(0, '36.673')] [2024-08-05 16:24:14,510][15444] Updated weights for policy 0, policy_version 18351 (0.0028) [2024-08-05 16:24:17,882][15444] Updated weights for policy 0, policy_version 18361 (0.0016) [2024-08-05 16:24:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 150413312. Throughput: 0: 6020.4. Samples: 37609820. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:24:18,119][15372] Avg episode reward: [(0, '37.663')] [2024-08-05 16:24:18,134][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000018362_150421504.pth... [2024-08-05 16:24:18,250][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000017653_144613376.pth [2024-08-05 16:24:21,514][15444] Updated weights for policy 0, policy_version 18371 (0.0011) [2024-08-05 16:24:23,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24029.8, 300 sec: 24215.0). Total num frames: 150536192. Throughput: 0: 6013.8. Samples: 37628110. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:24:23,119][15372] Avg episode reward: [(0, '37.730')] [2024-08-05 16:24:24,595][15444] Updated weights for policy 0, policy_version 18381 (0.0015) [2024-08-05 16:24:28,048][15444] Updated weights for policy 0, policy_version 18391 (0.0013) [2024-08-05 16:24:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24168.6, 300 sec: 24187.2). Total num frames: 150659072. Throughput: 0: 6034.0. Samples: 37664760. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:24:28,119][15372] Avg episode reward: [(0, '37.260')] [2024-08-05 16:24:31,363][15444] Updated weights for policy 0, policy_version 18401 (0.0031) [2024-08-05 16:24:33,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 150781952. Throughput: 0: 6043.8. Samples: 37700880. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 16:24:33,126][15372] Avg episode reward: [(0, '36.656')] [2024-08-05 16:24:34,786][15444] Updated weights for policy 0, policy_version 18411 (0.0017) [2024-08-05 16:24:38,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.8, 300 sec: 24215.0). Total num frames: 150896640. Throughput: 0: 6057.8. Samples: 37719600. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 16:24:38,126][15372] Avg episode reward: [(0, '36.165')] [2024-08-05 16:24:38,255][15444] Updated weights for policy 0, policy_version 18421 (0.0011) [2024-08-05 16:24:41,506][15444] Updated weights for policy 0, policy_version 18431 (0.0017) [2024-08-05 16:24:43,119][15372] Fps is (10 sec: 23755.3, 60 sec: 24166.1, 300 sec: 24187.2). Total num frames: 151019520. Throughput: 0: 6055.5. Samples: 37755810. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 16:24:43,127][15372] Avg episode reward: [(0, '36.155')] [2024-08-05 16:24:44,866][15444] Updated weights for policy 0, policy_version 18441 (0.0018) [2024-08-05 16:24:48,065][15444] Updated weights for policy 0, policy_version 18451 (0.0011) [2024-08-05 16:24:48,118][15372] Fps is (10 sec: 25395.4, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 151150592. Throughput: 0: 6075.6. Samples: 37792460. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 16:24:48,119][15372] Avg episode reward: [(0, '36.195')] [2024-08-05 16:24:50,852][15417] Signal inference workers to stop experience collection... (6700 times) [2024-08-05 16:24:50,853][15417] Signal inference workers to resume experience collection... (6700 times) [2024-08-05 16:24:50,896][15444] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-08-05 16:24:50,896][15444] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-08-05 16:24:51,489][15444] Updated weights for policy 0, policy_version 18461 (0.0026) [2024-08-05 16:24:53,119][15372] Fps is (10 sec: 24576.8, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 151265280. Throughput: 0: 6096.6. Samples: 37811420. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 16:24:53,119][15372] Avg episode reward: [(0, '36.827')] [2024-08-05 16:24:54,855][15444] Updated weights for policy 0, policy_version 18471 (0.0016) [2024-08-05 16:24:58,122][15372] Fps is (10 sec: 24567.1, 60 sec: 24438.0, 300 sec: 24214.7). Total num frames: 151396352. Throughput: 0: 6107.7. Samples: 37848440. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:24:58,123][15372] Avg episode reward: [(0, '37.705')] [2024-08-05 16:24:58,131][15444] Updated weights for policy 0, policy_version 18481 (0.0011) [2024-08-05 16:25:01,634][15444] Updated weights for policy 0, policy_version 18491 (0.0014) [2024-08-05 16:25:03,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 151511040. Throughput: 0: 6104.9. Samples: 37884540. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:25:03,119][15372] Avg episode reward: [(0, '38.350')] [2024-08-05 16:25:04,808][15444] Updated weights for policy 0, policy_version 18501 (0.0031) [2024-08-05 16:25:08,118][15372] Fps is (10 sec: 23765.5, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 151633920. Throughput: 0: 6115.8. Samples: 37903320. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:25:08,126][15372] Avg episode reward: [(0, '38.007')] [2024-08-05 16:25:08,357][15444] Updated weights for policy 0, policy_version 18511 (0.0016) [2024-08-05 16:25:11,717][15444] Updated weights for policy 0, policy_version 18521 (0.0011) [2024-08-05 16:25:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 151756800. Throughput: 0: 6106.7. Samples: 37939560. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 16:25:13,119][15372] Avg episode reward: [(0, '38.039')] [2024-08-05 16:25:14,853][15444] Updated weights for policy 0, policy_version 18531 (0.0021) [2024-08-05 16:25:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24242.8). 
Total num frames: 151879680. Throughput: 0: 6122.9. Samples: 37976410. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 16:25:18,126][15372] Avg episode reward: [(0, '37.495')] [2024-08-05 16:25:18,267][15444] Updated weights for policy 0, policy_version 18541 (0.0027) [2024-08-05 16:25:21,819][15444] Updated weights for policy 0, policy_version 18551 (0.0019) [2024-08-05 16:25:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 152002560. Throughput: 0: 6110.9. Samples: 37994590. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 16:25:23,119][15372] Avg episode reward: [(0, '36.306')] [2024-08-05 16:25:24,878][15444] Updated weights for policy 0, policy_version 18561 (0.0013) [2024-08-05 16:25:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 152117248. Throughput: 0: 6119.6. Samples: 38031190. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:25:28,126][15372] Avg episode reward: [(0, '37.700')] [2024-08-05 16:25:28,492][15444] Updated weights for policy 0, policy_version 18571 (0.0021) [2024-08-05 16:25:31,818][15444] Updated weights for policy 0, policy_version 18581 (0.0024) [2024-08-05 16:25:33,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 152240128. Throughput: 0: 6113.8. Samples: 38067580. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:25:33,119][15372] Avg episode reward: [(0, '38.490')] [2024-08-05 16:25:35,089][15444] Updated weights for policy 0, policy_version 18591 (0.0013) [2024-08-05 16:25:38,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24576.0, 300 sec: 24270.5). Total num frames: 152371200. Throughput: 0: 6097.6. Samples: 38085810. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 16:25:38,126][15372] Avg episode reward: [(0, '38.696')] [2024-08-05 16:25:38,815][15444] Updated weights for policy 0, policy_version 18601 (0.0020) [2024-08-05 16:25:39,625][15417] Signal inference workers to stop experience collection... (6750 times) [2024-08-05 16:25:39,626][15417] Signal inference workers to resume experience collection... (6750 times) [2024-08-05 16:25:39,670][15444] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-08-05 16:25:39,671][15444] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-08-05 16:25:41,679][15444] Updated weights for policy 0, policy_version 18611 (0.0012) [2024-08-05 16:25:43,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24439.7, 300 sec: 24242.8). Total num frames: 152485888. Throughput: 0: 6084.9. Samples: 38122240. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 16:25:43,119][15372] Avg episode reward: [(0, '38.615')] [2024-08-05 16:25:45,209][15444] Updated weights for policy 0, policy_version 18621 (0.0011) [2024-08-05 16:25:48,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 152616960. Throughput: 0: 6111.3. Samples: 38159550. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 16:25:48,126][15372] Avg episode reward: [(0, '37.605')] [2024-08-05 16:25:48,486][15444] Updated weights for policy 0, policy_version 18631 (0.0022) [2024-08-05 16:25:51,884][15444] Updated weights for policy 0, policy_version 18641 (0.0041) [2024-08-05 16:25:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.6, 300 sec: 24242.8). Total num frames: 152731648. Throughput: 0: 6101.6. Samples: 38177890. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 16:25:53,119][15372] Avg episode reward: [(0, '37.046')] [2024-08-05 16:25:55,286][15444] Updated weights for policy 0, policy_version 18651 (0.0012) [2024-08-05 16:25:58,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24304.4, 300 sec: 24242.8). Total num frames: 152854528. Throughput: 0: 6109.5. Samples: 38214490. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 16:25:58,126][15372] Avg episode reward: [(0, '37.415')] [2024-08-05 16:25:58,487][15444] Updated weights for policy 0, policy_version 18661 (0.0024) [2024-08-05 16:26:02,114][15444] Updated weights for policy 0, policy_version 18671 (0.0018) [2024-08-05 16:26:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 152977408. Throughput: 0: 6083.3. Samples: 38250160. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 16:26:03,119][15372] Avg episode reward: [(0, '36.794')] [2024-08-05 16:26:05,337][15444] Updated weights for policy 0, policy_version 18681 (0.0011) [2024-08-05 16:26:08,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24439.2, 300 sec: 24242.9). Total num frames: 153100288. Throughput: 0: 6102.4. Samples: 38269200. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 16:26:08,124][15372] Avg episode reward: [(0, '37.715')] [2024-08-05 16:26:08,661][15444] Updated weights for policy 0, policy_version 18691 (0.0011) [2024-08-05 16:26:12,051][15444] Updated weights for policy 0, policy_version 18701 (0.0024) [2024-08-05 16:26:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 153223168. Throughput: 0: 6096.2. Samples: 38305520. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 16:26:13,119][15372] Avg episode reward: [(0, '37.757')] [2024-08-05 16:26:15,225][15444] Updated weights for policy 0, policy_version 18711 (0.0010) [2024-08-05 16:26:18,118][15372] Fps is (10 sec: 24577.2, 60 sec: 24439.4, 300 sec: 24270.5). 
Total num frames: 153346048. Throughput: 0: 6090.7. Samples: 38341660. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:26:18,119][15372] Avg episode reward: [(0, '37.733')] [2024-08-05 16:26:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000018719_153346048.pth... [2024-08-05 16:26:18,228][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000018008_147521536.pth [2024-08-05 16:26:19,007][15444] Updated weights for policy 0, policy_version 18721 (0.0012) [2024-08-05 16:26:22,237][15444] Updated weights for policy 0, policy_version 18731 (0.0011) [2024-08-05 16:26:23,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 153460736. Throughput: 0: 6088.9. Samples: 38359810. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:26:23,126][15372] Avg episode reward: [(0, '37.626')] [2024-08-05 16:26:25,685][15444] Updated weights for policy 0, policy_version 18741 (0.0016) [2024-08-05 16:26:28,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24575.9, 300 sec: 24270.5). Total num frames: 153591808. Throughput: 0: 6089.3. Samples: 38396260. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:26:28,119][15372] Avg episode reward: [(0, '36.715')] [2024-08-05 16:26:29,074][15444] Updated weights for policy 0, policy_version 18751 (0.0011) [2024-08-05 16:26:30,121][15417] Signal inference workers to stop experience collection... (6800 times) [2024-08-05 16:26:30,127][15417] Signal inference workers to resume experience collection... 
(6800 times) [2024-08-05 16:26:30,199][15444] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-08-05 16:26:30,199][15444] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-08-05 16:26:32,194][15444] Updated weights for policy 0, policy_version 18761 (0.0013) [2024-08-05 16:26:33,126][15372] Fps is (10 sec: 23739.4, 60 sec: 24299.9, 300 sec: 24242.2). Total num frames: 153698304. Throughput: 0: 6073.2. Samples: 38432890. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:26:33,126][15372] Avg episode reward: [(0, '35.789')] [2024-08-05 16:26:35,580][15444] Updated weights for policy 0, policy_version 18771 (0.0037) [2024-08-05 16:26:38,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 153829376. Throughput: 0: 6079.5. Samples: 38451470. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:26:38,119][15372] Avg episode reward: [(0, '36.751')] [2024-08-05 16:26:39,034][15444] Updated weights for policy 0, policy_version 18781 (0.0011) [2024-08-05 16:26:42,298][15444] Updated weights for policy 0, policy_version 18791 (0.0018) [2024-08-05 16:26:43,119][15372] Fps is (10 sec: 25412.4, 60 sec: 24439.2, 300 sec: 24270.5). Total num frames: 153952256. Throughput: 0: 6085.0. Samples: 38488320. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:26:43,127][15372] Avg episode reward: [(0, '37.867')] [2024-08-05 16:26:45,811][15444] Updated weights for policy 0, policy_version 18801 (0.0011) [2024-08-05 16:26:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24270.7). Total num frames: 154075136. Throughput: 0: 6093.3. Samples: 38524360. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:26:48,126][15372] Avg episode reward: [(0, '37.247')] [2024-08-05 16:26:49,263][15444] Updated weights for policy 0, policy_version 18811 (0.0011) [2024-08-05 16:26:52,310][15444] Updated weights for policy 0, policy_version 18821 (0.0026) [2024-08-05 16:26:53,118][15372] Fps is (10 sec: 23758.6, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 154189824. Throughput: 0: 6066.8. Samples: 38542200. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:26:53,126][15372] Avg episode reward: [(0, '36.098')] [2024-08-05 16:26:56,027][15444] Updated weights for policy 0, policy_version 18831 (0.0017) [2024-08-05 16:26:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24270.6). Total num frames: 154312704. Throughput: 0: 6051.3. Samples: 38577830. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:26:58,119][15372] Avg episode reward: [(0, '36.811')] [2024-08-05 16:26:59,408][15444] Updated weights for policy 0, policy_version 18841 (0.0024) [2024-08-05 16:27:02,847][15444] Updated weights for policy 0, policy_version 18851 (0.0013) [2024-08-05 16:27:03,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 154435584. Throughput: 0: 6054.9. Samples: 38614130. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:27:03,119][15372] Avg episode reward: [(0, '37.452')] [2024-08-05 16:27:06,022][15444] Updated weights for policy 0, policy_version 18861 (0.0013) [2024-08-05 16:27:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.6, 300 sec: 24242.8). Total num frames: 154550272. Throughput: 0: 6059.3. Samples: 38632480. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:27:08,119][15372] Avg episode reward: [(0, '38.973')] [2024-08-05 16:27:09,651][15444] Updated weights for policy 0, policy_version 18871 (0.0017) [2024-08-05 16:27:13,018][15444] Updated weights for policy 0, policy_version 18881 (0.0043) [2024-08-05 16:27:13,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24270.5). Total num frames: 154673152. Throughput: 0: 6046.7. Samples: 38668360. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 16:27:13,119][15372] Avg episode reward: [(0, '37.413')] [2024-08-05 16:27:16,224][15444] Updated weights for policy 0, policy_version 18891 (0.0018) [2024-08-05 16:27:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 154796032. Throughput: 0: 6045.2. Samples: 38704880. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 16:27:18,126][15372] Avg episode reward: [(0, '37.634')] [2024-08-05 16:27:19,806][15444] Updated weights for policy 0, policy_version 18901 (0.0012) [2024-08-05 16:27:22,975][15444] Updated weights for policy 0, policy_version 18911 (0.0019) [2024-08-05 16:27:23,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24303.0, 300 sec: 24270.7). Total num frames: 154918912. Throughput: 0: 6038.7. Samples: 38723210. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 16:27:23,119][15372] Avg episode reward: [(0, '38.050')] [2024-08-05 16:27:26,411][15444] Updated weights for policy 0, policy_version 18921 (0.0012) [2024-08-05 16:27:28,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24270.6). Total num frames: 155033600. Throughput: 0: 6029.4. Samples: 38759640. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 16:27:28,126][15372] Avg episode reward: [(0, '37.121')] [2024-08-05 16:27:29,867][15444] Updated weights for policy 0, policy_version 18931 (0.0043) [2024-08-05 16:27:31,197][15417] Signal inference workers to stop experience collection... 
(6850 times) [2024-08-05 16:27:31,204][15417] Signal inference workers to resume experience collection... (6850 times) [2024-08-05 16:27:31,243][15444] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-08-05 16:27:31,251][15444] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-08-05 16:27:33,109][15444] Updated weights for policy 0, policy_version 18941 (0.0019) [2024-08-05 16:27:33,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24442.5, 300 sec: 24271.8). Total num frames: 155164672. Throughput: 0: 6058.0. Samples: 38796970. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 16:27:33,119][15372] Avg episode reward: [(0, '37.013')] [2024-08-05 16:27:36,508][15444] Updated weights for policy 0, policy_version 18951 (0.0027) [2024-08-05 16:27:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 155279360. Throughput: 0: 6064.2. Samples: 38815090. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 16:27:38,119][15372] Avg episode reward: [(0, '36.842')] [2024-08-05 16:27:39,591][15444] Updated weights for policy 0, policy_version 18961 (0.0022) [2024-08-05 16:27:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.7, 300 sec: 24298.3). Total num frames: 155402240. Throughput: 0: 6090.7. Samples: 38851910. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 16:27:43,119][15372] Avg episode reward: [(0, '36.718')] [2024-08-05 16:27:43,331][15444] Updated weights for policy 0, policy_version 18971 (0.0027) [2024-08-05 16:27:46,459][15444] Updated weights for policy 0, policy_version 18981 (0.0022) [2024-08-05 16:27:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 155525120. Throughput: 0: 6084.4. Samples: 38887930. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 16:27:48,126][15372] Avg episode reward: [(0, '36.871')] [2024-08-05 16:27:49,977][15444] Updated weights for policy 0, policy_version 18991 (0.0013) [2024-08-05 16:27:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 155648000. Throughput: 0: 6081.3. Samples: 38906140. Policy #0 lag: (min: 0.0, avg: 3.0, max: 8.0) [2024-08-05 16:27:53,119][15372] Avg episode reward: [(0, '37.722')] [2024-08-05 16:27:53,523][15444] Updated weights for policy 0, policy_version 19001 (0.0016) [2024-08-05 16:27:56,651][15444] Updated weights for policy 0, policy_version 19011 (0.0013) [2024-08-05 16:27:58,119][15372] Fps is (10 sec: 24574.6, 60 sec: 24302.7, 300 sec: 24298.3). Total num frames: 155770880. Throughput: 0: 6094.4. Samples: 38942610. Policy #0 lag: (min: 0.0, avg: 3.0, max: 8.0) [2024-08-05 16:27:58,119][15372] Avg episode reward: [(0, '38.159')] [2024-08-05 16:28:00,011][15444] Updated weights for policy 0, policy_version 19021 (0.0022) [2024-08-05 16:28:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 155893760. Throughput: 0: 6101.6. Samples: 38979450. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:28:03,126][15372] Avg episode reward: [(0, '38.667')] [2024-08-05 16:28:03,298][15444] Updated weights for policy 0, policy_version 19031 (0.0022) [2024-08-05 16:28:06,228][15417] Signal inference workers to stop experience collection... (6900 times) [2024-08-05 16:28:06,229][15417] Signal inference workers to resume experience collection... 
(6900 times) [2024-08-05 16:28:06,270][15444] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-08-05 16:28:06,275][15444] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-08-05 16:28:06,691][15444] Updated weights for policy 0, policy_version 19041 (0.0017) [2024-08-05 16:28:08,118][15372] Fps is (10 sec: 24577.3, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 156016640. Throughput: 0: 6111.8. Samples: 38998240. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:28:08,119][15372] Avg episode reward: [(0, '37.642')] [2024-08-05 16:28:10,178][15444] Updated weights for policy 0, policy_version 19051 (0.0012) [2024-08-05 16:28:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.6, 300 sec: 24326.1). Total num frames: 156139520. Throughput: 0: 6110.5. Samples: 39034610. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:28:13,126][15372] Avg episode reward: [(0, '36.709')] [2024-08-05 16:28:13,277][15444] Updated weights for policy 0, policy_version 19061 (0.0013) [2024-08-05 16:28:16,947][15444] Updated weights for policy 0, policy_version 19071 (0.0012) [2024-08-05 16:28:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 156262400. Throughput: 0: 6081.3. Samples: 39070630. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:28:18,119][15372] Avg episode reward: [(0, '36.604')] [2024-08-05 16:28:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000019075_156262400.pth... [2024-08-05 16:28:18,233][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000018362_150421504.pth [2024-08-05 16:28:20,048][15444] Updated weights for policy 0, policy_version 19081 (0.0017) [2024-08-05 16:28:23,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24302.8, 300 sec: 24298.7). Total num frames: 156377088. 
Throughput: 0: 6084.4. Samples: 39088890. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:28:23,126][15372] Avg episode reward: [(0, '36.490')] [2024-08-05 16:28:23,644][15444] Updated weights for policy 0, policy_version 19091 (0.0011) [2024-08-05 16:28:27,110][15444] Updated weights for policy 0, policy_version 19101 (0.0026) [2024-08-05 16:28:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 156499968. Throughput: 0: 6070.4. Samples: 39125080. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:28:28,119][15372] Avg episode reward: [(0, '37.203')] [2024-08-05 16:28:30,239][15444] Updated weights for policy 0, policy_version 19111 (0.0020) [2024-08-05 16:28:33,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 156622848. Throughput: 0: 6074.0. Samples: 39161260. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:28:33,126][15372] Avg episode reward: [(0, '37.451')] [2024-08-05 16:28:33,976][15444] Updated weights for policy 0, policy_version 19121 (0.0013) [2024-08-05 16:28:36,868][15444] Updated weights for policy 0, policy_version 19131 (0.0011) [2024-08-05 16:28:38,119][15372] Fps is (10 sec: 22937.3, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 156729344. Throughput: 0: 6084.0. Samples: 39179920. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:28:38,119][15372] Avg episode reward: [(0, '37.652')] [2024-08-05 16:28:40,622][15444] Updated weights for policy 0, policy_version 19141 (0.0019) [2024-08-05 16:28:43,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 156860416. Throughput: 0: 6065.4. Samples: 39215550. 
Policy #0 lag: (min: 2.0, avg: 4.0, max: 9.0) [2024-08-05 16:28:43,127][15372] Avg episode reward: [(0, '37.514')] [2024-08-05 16:28:44,066][15444] Updated weights for policy 0, policy_version 19151 (0.0029) [2024-08-05 16:28:45,170][15417] Signal inference workers to stop experience collection... (6950 times) [2024-08-05 16:28:45,171][15417] Signal inference workers to resume experience collection... (6950 times) [2024-08-05 16:28:45,225][15444] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-08-05 16:28:45,228][15444] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-08-05 16:28:47,270][15444] Updated weights for policy 0, policy_version 19161 (0.0021) [2024-08-05 16:28:48,118][15372] Fps is (10 sec: 25395.6, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 156983296. Throughput: 0: 6052.0. Samples: 39251790. Policy #0 lag: (min: 2.0, avg: 4.0, max: 9.0) [2024-08-05 16:28:48,119][15372] Avg episode reward: [(0, '36.659')] [2024-08-05 16:28:50,792][15444] Updated weights for policy 0, policy_version 19171 (0.0012) [2024-08-05 16:28:53,141][15372] Fps is (10 sec: 24520.9, 60 sec: 24293.8, 300 sec: 24324.2). Total num frames: 157106176. Throughput: 0: 6055.9. Samples: 39270890. Policy #0 lag: (min: 2.0, avg: 4.0, max: 9.0) [2024-08-05 16:28:53,141][15372] Avg episode reward: [(0, '36.834')] [2024-08-05 16:28:54,001][15444] Updated weights for policy 0, policy_version 19181 (0.0023) [2024-08-05 16:28:57,354][15444] Updated weights for policy 0, policy_version 19191 (0.0018) [2024-08-05 16:28:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.1, 300 sec: 24326.1). Total num frames: 157229056. Throughput: 0: 6065.1. Samples: 39307540. 
Policy #0 lag: (min: 1.0, avg: 4.7, max: 8.0) [2024-08-05 16:28:58,119][15372] Avg episode reward: [(0, '37.817')] [2024-08-05 16:29:01,099][15444] Updated weights for policy 0, policy_version 19201 (0.0047) [2024-08-05 16:29:03,119][15372] Fps is (10 sec: 23810.3, 60 sec: 24166.4, 300 sec: 24298.3). Total num frames: 157343744. Throughput: 0: 6064.0. Samples: 39343510. Policy #0 lag: (min: 1.0, avg: 4.7, max: 8.0) [2024-08-05 16:29:03,119][15372] Avg episode reward: [(0, '37.716')] [2024-08-05 16:29:03,974][15444] Updated weights for policy 0, policy_version 19211 (0.0010) [2024-08-05 16:29:07,794][15444] Updated weights for policy 0, policy_version 19221 (0.0010) [2024-08-05 16:29:08,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24166.2, 300 sec: 24326.0). Total num frames: 157466624. Throughput: 0: 6059.3. Samples: 39361560. Policy #0 lag: (min: 1.0, avg: 4.7, max: 8.0) [2024-08-05 16:29:08,119][15372] Avg episode reward: [(0, '36.594')] [2024-08-05 16:29:10,994][15444] Updated weights for policy 0, policy_version 19231 (0.0018) [2024-08-05 16:29:13,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.3, 300 sec: 24326.1). Total num frames: 157589504. Throughput: 0: 6047.3. Samples: 39397210. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:29:13,119][15372] Avg episode reward: [(0, '37.301')] [2024-08-05 16:29:14,475][15444] Updated weights for policy 0, policy_version 19241 (0.0012) [2024-08-05 16:29:17,843][15444] Updated weights for policy 0, policy_version 19251 (0.0018) [2024-08-05 16:29:18,118][15372] Fps is (10 sec: 23757.6, 60 sec: 24029.9, 300 sec: 24298.3). Total num frames: 157704192. Throughput: 0: 6046.9. Samples: 39433370. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:29:18,119][15372] Avg episode reward: [(0, '36.030')] [2024-08-05 16:29:20,963][15444] Updated weights for policy 0, policy_version 19261 (0.0011) [2024-08-05 16:29:22,787][15417] Signal inference workers to stop experience collection... 
(7000 times) [2024-08-05 16:29:22,795][15417] Signal inference workers to resume experience collection... (7000 times) [2024-08-05 16:29:22,840][15444] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-08-05 16:29:22,840][15444] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-08-05 16:29:23,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24303.0, 300 sec: 24326.1). Total num frames: 157835264. Throughput: 0: 6050.9. Samples: 39452210. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 16:29:23,119][15372] Avg episode reward: [(0, '36.237')] [2024-08-05 16:29:24,656][15444] Updated weights for policy 0, policy_version 19271 (0.0015) [2024-08-05 16:29:27,885][15444] Updated weights for policy 0, policy_version 19281 (0.0020) [2024-08-05 16:29:28,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24302.9, 300 sec: 24326.1). Total num frames: 157958144. Throughput: 0: 6073.1. Samples: 39488840. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 16:29:28,119][15372] Avg episode reward: [(0, '36.769')] [2024-08-05 16:29:31,248][15444] Updated weights for policy 0, policy_version 19291 (0.0015) [2024-08-05 16:29:33,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24326.1). Total num frames: 158072832. Throughput: 0: 6056.2. Samples: 39524320. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 16:29:33,126][15372] Avg episode reward: [(0, '37.594')] [2024-08-05 16:29:34,438][15444] Updated weights for policy 0, policy_version 19301 (0.0013) [2024-08-05 16:29:38,079][15444] Updated weights for policy 0, policy_version 19311 (0.0026) [2024-08-05 16:29:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24439.5, 300 sec: 24326.1). Total num frames: 158195712. Throughput: 0: 6053.3. Samples: 39543150. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:29:38,119][15372] Avg episode reward: [(0, '38.202')] [2024-08-05 16:29:41,386][15444] Updated weights for policy 0, policy_version 19321 (0.0012) [2024-08-05 16:29:43,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.0, 300 sec: 24298.3). Total num frames: 158318592. Throughput: 0: 6035.8. Samples: 39579150. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:29:43,119][15372] Avg episode reward: [(0, '37.915')] [2024-08-05 16:29:44,757][15444] Updated weights for policy 0, policy_version 19331 (0.0021) [2024-08-05 16:29:48,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24298.3). Total num frames: 158433280. Throughput: 0: 6048.4. Samples: 39615690. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:29:48,126][15372] Avg episode reward: [(0, '37.279')] [2024-08-05 16:29:48,192][15444] Updated weights for policy 0, policy_version 19341 (0.0023) [2024-08-05 16:29:51,307][15444] Updated weights for policy 0, policy_version 19351 (0.0020) [2024-08-05 16:29:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24175.5, 300 sec: 24270.8). Total num frames: 158556160. Throughput: 0: 6058.7. Samples: 39634200. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 16:29:53,126][15372] Avg episode reward: [(0, '37.875')] [2024-08-05 16:29:55,102][15444] Updated weights for policy 0, policy_version 19361 (0.0020) [2024-08-05 16:29:58,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24298.3). Total num frames: 158679040. Throughput: 0: 6069.3. Samples: 39670330. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 16:29:58,126][15372] Avg episode reward: [(0, '37.946')] [2024-08-05 16:29:58,277][15444] Updated weights for policy 0, policy_version 19371 (0.0013) [2024-08-05 16:30:01,639][15444] Updated weights for policy 0, policy_version 19381 (0.0019) [2024-08-05 16:30:03,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24270.5). 
Total num frames: 158793728. Throughput: 0: 6060.9. Samples: 39706110. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 16:30:03,126][15372] Avg episode reward: [(0, '38.079')] [2024-08-05 16:30:05,179][15444] Updated weights for policy 0, policy_version 19391 (0.0012) [2024-08-05 16:30:05,844][15417] Signal inference workers to stop experience collection... (7050 times) [2024-08-05 16:30:05,855][15417] Signal inference workers to resume experience collection... (7050 times) [2024-08-05 16:30:05,879][15444] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-08-05 16:30:05,879][15444] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-08-05 16:30:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.1, 300 sec: 24298.3). Total num frames: 158924800. Throughput: 0: 6059.3. Samples: 39724880. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 16:30:08,119][15372] Avg episode reward: [(0, '36.834')] [2024-08-05 16:30:08,172][15444] Updated weights for policy 0, policy_version 19401 (0.0024) [2024-08-05 16:30:11,807][15444] Updated weights for policy 0, policy_version 19411 (0.0024) [2024-08-05 16:30:13,120][15372] Fps is (10 sec: 24571.6, 60 sec: 24165.7, 300 sec: 24270.4). Total num frames: 159039488. Throughput: 0: 6053.8. Samples: 39761270. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 16:30:13,121][15372] Avg episode reward: [(0, '36.505')] [2024-08-05 16:30:15,254][15444] Updated weights for policy 0, policy_version 19421 (0.0020) [2024-08-05 16:30:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 159170560. Throughput: 0: 6075.8. Samples: 39797730. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 16:30:18,127][15372] Avg episode reward: [(0, '37.858')] [2024-08-05 16:30:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000019430_159170560.pth... 
[2024-08-05 16:30:18,242][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000018719_153346048.pth [2024-08-05 16:30:18,572][15444] Updated weights for policy 0, policy_version 19431 (0.0023) [2024-08-05 16:30:22,161][15444] Updated weights for policy 0, policy_version 19441 (0.0018) [2024-08-05 16:30:23,122][15372] Fps is (10 sec: 24572.5, 60 sec: 24165.1, 300 sec: 24298.0). Total num frames: 159285248. Throughput: 0: 6052.7. Samples: 39815540. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 16:30:23,122][15372] Avg episode reward: [(0, '37.589')] [2024-08-05 16:30:25,174][15444] Updated weights for policy 0, policy_version 19451 (0.0023) [2024-08-05 16:30:28,119][15372] Fps is (10 sec: 22936.7, 60 sec: 24029.7, 300 sec: 24270.5). Total num frames: 159399936. Throughput: 0: 6063.3. Samples: 39852000. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 16:30:28,127][15372] Avg episode reward: [(0, '37.223')] [2024-08-05 16:30:28,801][15444] Updated weights for policy 0, policy_version 19461 (0.0018) [2024-08-05 16:30:32,381][15444] Updated weights for policy 0, policy_version 19471 (0.0016) [2024-08-05 16:30:33,119][15372] Fps is (10 sec: 23764.2, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 159522816. Throughput: 0: 6047.6. Samples: 39887830. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 16:30:33,119][15372] Avg episode reward: [(0, '36.345')] [2024-08-05 16:30:35,491][15444] Updated weights for policy 0, policy_version 19481 (0.0022) [2024-08-05 16:30:38,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 159645696. Throughput: 0: 6031.3. Samples: 39905610. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 16:30:38,126][15372] Avg episode reward: [(0, '37.668')] [2024-08-05 16:30:39,129][15444] Updated weights for policy 0, policy_version 19491 (0.0012) [2024-08-05 16:30:42,266][15444] Updated weights for policy 0, policy_version 19501 (0.0020) [2024-08-05 16:30:43,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 159760384. Throughput: 0: 6030.4. Samples: 39941700. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 16:30:43,126][15372] Avg episode reward: [(0, '38.303')] [2024-08-05 16:30:43,409][15417] Signal inference workers to stop experience collection... (7100 times) [2024-08-05 16:30:43,414][15417] Signal inference workers to resume experience collection... (7100 times) [2024-08-05 16:30:43,490][15444] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-08-05 16:30:43,490][15444] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-08-05 16:30:45,970][15444] Updated weights for policy 0, policy_version 19511 (0.0011) [2024-08-05 16:30:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24242.8). Total num frames: 159883264. Throughput: 0: 6026.5. Samples: 39977300. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 16:30:48,119][15372] Avg episode reward: [(0, '37.804')] [2024-08-05 16:30:49,567][15444] Updated weights for policy 0, policy_version 19521 (0.0026) [2024-08-05 16:30:52,656][15444] Updated weights for policy 0, policy_version 19531 (0.0036) [2024-08-05 16:30:53,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 160006144. Throughput: 0: 6006.6. Samples: 39995180. 
Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 16:30:53,119][15372] Avg episode reward: [(0, '38.566')] [2024-08-05 16:30:56,114][15444] Updated weights for policy 0, policy_version 19541 (0.0017) [2024-08-05 16:30:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 160129024. Throughput: 0: 6003.8. Samples: 40031430. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:30:58,126][15372] Avg episode reward: [(0, '37.828')] [2024-08-05 16:30:59,417][15444] Updated weights for policy 0, policy_version 19551 (0.0019) [2024-08-05 16:31:03,098][15444] Updated weights for policy 0, policy_version 19561 (0.0015) [2024-08-05 16:31:03,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 160243712. Throughput: 0: 5996.0. Samples: 40067550. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:31:03,119][15372] Avg episode reward: [(0, '36.841')] [2024-08-05 16:31:06,276][15444] Updated weights for policy 0, policy_version 19571 (0.0017) [2024-08-05 16:31:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 160366592. Throughput: 0: 6003.5. Samples: 40085680. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:31:08,126][15372] Avg episode reward: [(0, '37.115')] [2024-08-05 16:31:09,627][15444] Updated weights for policy 0, policy_version 19581 (0.0024) [2024-08-05 16:31:13,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24030.6, 300 sec: 24187.2). Total num frames: 160481280. Throughput: 0: 6004.3. Samples: 40122190. Policy #0 lag: (min: 2.0, avg: 4.1, max: 8.0) [2024-08-05 16:31:13,126][15372] Avg episode reward: [(0, '37.620')] [2024-08-05 16:31:13,195][15444] Updated weights for policy 0, policy_version 19591 (0.0012) [2024-08-05 16:31:16,528][15444] Updated weights for policy 0, policy_version 19601 (0.0019) [2024-08-05 16:31:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.3, 300 sec: 24215.0). 
Total num frames: 160604160. Throughput: 0: 6012.9. Samples: 40158410. Policy #0 lag: (min: 2.0, avg: 4.1, max: 8.0) [2024-08-05 16:31:18,126][15372] Avg episode reward: [(0, '37.202')] [2024-08-05 16:31:19,922][15444] Updated weights for policy 0, policy_version 19611 (0.0012) [2024-08-05 16:31:23,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24031.1, 300 sec: 24187.2). Total num frames: 160727040. Throughput: 0: 6024.9. Samples: 40176730. Policy #0 lag: (min: 2.0, avg: 4.1, max: 8.0) [2024-08-05 16:31:23,126][15372] Avg episode reward: [(0, '37.001')] [2024-08-05 16:31:23,256][15444] Updated weights for policy 0, policy_version 19621 (0.0020) [2024-08-05 16:31:26,609][15444] Updated weights for policy 0, policy_version 19631 (0.0021) [2024-08-05 16:31:26,741][15417] Signal inference workers to stop experience collection... (7150 times) [2024-08-05 16:31:26,742][15417] Signal inference workers to resume experience collection... (7150 times) [2024-08-05 16:31:26,821][15444] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-08-05 16:31:26,826][15444] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-08-05 16:31:28,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.6, 300 sec: 24243.4). Total num frames: 160849920. Throughput: 0: 6018.0. Samples: 40212510. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:31:28,119][15372] Avg episode reward: [(0, '37.103')] [2024-08-05 16:31:29,964][15444] Updated weights for policy 0, policy_version 19641 (0.0037) [2024-08-05 16:31:33,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 160972800. Throughput: 0: 6039.1. Samples: 40249060. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:31:33,126][15372] Avg episode reward: [(0, '36.694')] [2024-08-05 16:31:33,323][15444] Updated weights for policy 0, policy_version 19651 (0.0012) [2024-08-05 16:31:36,831][15444] Updated weights for policy 0, policy_version 19661 (0.0013) [2024-08-05 16:31:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 161095680. Throughput: 0: 6051.6. Samples: 40267500. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:31:38,119][15372] Avg episode reward: [(0, '37.382')] [2024-08-05 16:31:40,055][15444] Updated weights for policy 0, policy_version 19671 (0.0010) [2024-08-05 16:31:43,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 161218560. Throughput: 0: 6059.5. Samples: 40304110. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:31:43,126][15372] Avg episode reward: [(0, '37.430')] [2024-08-05 16:31:43,487][15444] Updated weights for policy 0, policy_version 19681 (0.0015) [2024-08-05 16:31:47,007][15444] Updated weights for policy 0, policy_version 19691 (0.0011) [2024-08-05 16:31:48,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 161333248. Throughput: 0: 6056.2. Samples: 40340080. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:31:48,119][15372] Avg episode reward: [(0, '38.008')] [2024-08-05 16:31:50,149][15444] Updated weights for policy 0, policy_version 19701 (0.0013) [2024-08-05 16:31:53,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 161456128. Throughput: 0: 6066.4. Samples: 40358670. 
Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 16:31:53,126][15372] Avg episode reward: [(0, '38.336')] [2024-08-05 16:31:53,802][15444] Updated weights for policy 0, policy_version 19711 (0.0016) [2024-08-05 16:31:56,793][15444] Updated weights for policy 0, policy_version 19721 (0.0017) [2024-08-05 16:31:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 161570816. Throughput: 0: 6051.5. Samples: 40394510. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 16:31:58,119][15372] Avg episode reward: [(0, '37.586')] [2024-08-05 16:32:00,456][15444] Updated weights for policy 0, policy_version 19731 (0.0024) [2024-08-05 16:32:03,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 161701888. Throughput: 0: 6054.0. Samples: 40430840. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 16:32:03,126][15372] Avg episode reward: [(0, '38.052')] [2024-08-05 16:32:03,967][15444] Updated weights for policy 0, policy_version 19741 (0.0022) [2024-08-05 16:32:07,180][15444] Updated weights for policy 0, policy_version 19751 (0.0027) [2024-08-05 16:32:08,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24166.2, 300 sec: 24215.0). Total num frames: 161816576. Throughput: 0: 6041.5. Samples: 40448600. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:32:08,127][15372] Avg episode reward: [(0, '39.095')] [2024-08-05 16:32:09,199][15417] Signal inference workers to stop experience collection... (7200 times) [2024-08-05 16:32:09,199][15417] Signal inference workers to resume experience collection... 
(7200 times) [2024-08-05 16:32:09,234][15444] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-08-05 16:32:09,234][15444] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-08-05 16:32:10,838][15444] Updated weights for policy 0, policy_version 19761 (0.0026) [2024-08-05 16:32:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 161939456. Throughput: 0: 6053.8. Samples: 40484930. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:32:13,119][15372] Avg episode reward: [(0, '37.690')] [2024-08-05 16:32:13,898][15444] Updated weights for policy 0, policy_version 19771 (0.0015) [2024-08-05 16:32:17,498][15444] Updated weights for policy 0, policy_version 19781 (0.0028) [2024-08-05 16:32:18,118][15372] Fps is (10 sec: 24577.2, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 162062336. Throughput: 0: 6042.4. Samples: 40520970. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:32:18,119][15372] Avg episode reward: [(0, '38.002')] [2024-08-05 16:32:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000019783_162062336.pth... [2024-08-05 16:32:18,253][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000019075_156262400.pth [2024-08-05 16:32:20,818][15444] Updated weights for policy 0, policy_version 19791 (0.0017) [2024-08-05 16:32:23,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 162177024. Throughput: 0: 6035.1. Samples: 40539080. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:32:23,119][15372] Avg episode reward: [(0, '37.666')] [2024-08-05 16:32:24,238][15444] Updated weights for policy 0, policy_version 19801 (0.0027) [2024-08-05 16:32:27,633][15444] Updated weights for policy 0, policy_version 19811 (0.0013) [2024-08-05 16:32:28,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 162299904. Throughput: 0: 6040.9. Samples: 40575950. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:32:28,119][15372] Avg episode reward: [(0, '37.902')] [2024-08-05 16:32:30,822][15444] Updated weights for policy 0, policy_version 19821 (0.0030) [2024-08-05 16:32:33,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 162422784. Throughput: 0: 6040.2. Samples: 40611890. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:32:33,126][15372] Avg episode reward: [(0, '38.822')] [2024-08-05 16:32:34,444][15444] Updated weights for policy 0, policy_version 19831 (0.0016) [2024-08-05 16:32:37,770][15444] Updated weights for policy 0, policy_version 19841 (0.0017) [2024-08-05 16:32:38,121][15372] Fps is (10 sec: 23751.5, 60 sec: 24028.9, 300 sec: 24187.0). Total num frames: 162537472. Throughput: 0: 6021.7. Samples: 40629660. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 16:32:38,121][15372] Avg episode reward: [(0, '38.053')] [2024-08-05 16:32:41,273][15444] Updated weights for policy 0, policy_version 19851 (0.0019) [2024-08-05 16:32:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 162660352. Throughput: 0: 6021.6. Samples: 40665480. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 16:32:43,126][15372] Avg episode reward: [(0, '37.420')] [2024-08-05 16:32:44,774][15444] Updated weights for policy 0, policy_version 19861 (0.0012) [2024-08-05 16:32:47,871][15444] Updated weights for policy 0, policy_version 19871 (0.0021) [2024-08-05 16:32:48,118][15372] Fps is (10 sec: 24582.4, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 162783232. Throughput: 0: 6026.4. Samples: 40702030. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:32:48,119][15372] Avg episode reward: [(0, '37.514')] [2024-08-05 16:32:51,311][15444] Updated weights for policy 0, policy_version 19881 (0.0013) [2024-08-05 16:32:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 162897920. Throughput: 0: 6048.7. Samples: 40720790. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:32:53,119][15372] Avg episode reward: [(0, '37.880')] [2024-08-05 16:32:54,571][15444] Updated weights for policy 0, policy_version 19891 (0.0011) [2024-08-05 16:32:55,257][15417] Signal inference workers to stop experience collection... (7250 times) [2024-08-05 16:32:55,266][15417] Signal inference workers to resume experience collection... (7250 times) [2024-08-05 16:32:55,313][15444] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-08-05 16:32:55,321][15444] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-08-05 16:32:57,949][15444] Updated weights for policy 0, policy_version 19901 (0.0016) [2024-08-05 16:32:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 163028992. Throughput: 0: 6050.2. Samples: 40757190. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:32:58,119][15372] Avg episode reward: [(0, '38.354')] [2024-08-05 16:33:01,466][15444] Updated weights for policy 0, policy_version 19911 (0.0013) [2024-08-05 16:33:03,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 163151872. Throughput: 0: 6054.0. Samples: 40793400. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:33:03,126][15372] Avg episode reward: [(0, '37.779')] [2024-08-05 16:33:04,851][15444] Updated weights for policy 0, policy_version 19921 (0.0013) [2024-08-05 16:33:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.6, 300 sec: 24159.5). Total num frames: 163266560. Throughput: 0: 6067.1. Samples: 40812100. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:33:08,126][15372] Avg episode reward: [(0, '37.914')] [2024-08-05 16:33:08,294][15444] Updated weights for policy 0, policy_version 19931 (0.0017) [2024-08-05 16:33:11,533][15444] Updated weights for policy 0, policy_version 19941 (0.0024) [2024-08-05 16:33:13,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 163389440. Throughput: 0: 6047.4. Samples: 40848080. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:33:13,126][15372] Avg episode reward: [(0, '38.093')] [2024-08-05 16:33:14,820][15444] Updated weights for policy 0, policy_version 19951 (0.0030) [2024-08-05 16:33:18,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 163512320. Throughput: 0: 6063.3. Samples: 40884740. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:33:18,126][15372] Avg episode reward: [(0, '38.435')] [2024-08-05 16:33:18,236][15444] Updated weights for policy 0, policy_version 19961 (0.0024) [2024-08-05 16:33:21,724][15444] Updated weights for policy 0, policy_version 19971 (0.0011) [2024-08-05 16:33:23,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24187.2). 
Total num frames: 163635200. Throughput: 0: 6080.8. Samples: 40903280. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:33:23,119][15372] Avg episode reward: [(0, '37.914')] [2024-08-05 16:33:24,846][15444] Updated weights for policy 0, policy_version 19981 (0.0020) [2024-08-05 16:33:28,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.6, 300 sec: 24159.5). Total num frames: 163749888. Throughput: 0: 6100.4. Samples: 40940000. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:33:28,126][15372] Avg episode reward: [(0, '37.690')] [2024-08-05 16:33:28,383][15444] Updated weights for policy 0, policy_version 19991 (0.0012) [2024-08-05 16:33:31,509][15444] Updated weights for policy 0, policy_version 20001 (0.0017) [2024-08-05 16:33:33,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 163880960. Throughput: 0: 6087.6. Samples: 40975970. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:33:33,126][15372] Avg episode reward: [(0, '37.112')] [2024-08-05 16:33:35,125][15444] Updated weights for policy 0, policy_version 20011 (0.0019) [2024-08-05 16:33:38,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24440.5, 300 sec: 24215.0). Total num frames: 164003840. Throughput: 0: 6074.4. Samples: 40994140. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:33:38,119][15372] Avg episode reward: [(0, '37.699')] [2024-08-05 16:33:38,641][15444] Updated weights for policy 0, policy_version 20021 (0.0011) [2024-08-05 16:33:41,793][15444] Updated weights for policy 0, policy_version 20031 (0.0026) [2024-08-05 16:33:43,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 164118528. Throughput: 0: 6069.8. Samples: 41030330. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 16:33:43,119][15372] Avg episode reward: [(0, '37.964')] [2024-08-05 16:33:45,307][15444] Updated weights for policy 0, policy_version 20041 (0.0011) [2024-08-05 16:33:48,058][15417] Signal inference workers to stop experience collection... (7300 times) [2024-08-05 16:33:48,058][15417] Signal inference workers to resume experience collection... (7300 times) [2024-08-05 16:33:48,088][15444] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-08-05 16:33:48,099][15444] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-08-05 16:33:48,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24302.8, 300 sec: 24189.0). Total num frames: 164241408. Throughput: 0: 6086.2. Samples: 41067280. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 16:33:48,119][15372] Avg episode reward: [(0, '37.511')] [2024-08-05 16:33:48,401][15444] Updated weights for policy 0, policy_version 20051 (0.0023) [2024-08-05 16:33:52,248][15444] Updated weights for policy 0, policy_version 20061 (0.0021) [2024-08-05 16:33:53,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 164364288. Throughput: 0: 6069.6. Samples: 41085230. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 16:33:53,119][15372] Avg episode reward: [(0, '37.976')] [2024-08-05 16:33:55,537][15444] Updated weights for policy 0, policy_version 20071 (0.0020) [2024-08-05 16:33:58,119][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 164478976. Throughput: 0: 6066.2. Samples: 41121060. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:33:58,126][15372] Avg episode reward: [(0, '38.311')] [2024-08-05 16:33:58,824][15444] Updated weights for policy 0, policy_version 20081 (0.0015) [2024-08-05 16:34:02,424][15444] Updated weights for policy 0, policy_version 20091 (0.0029) [2024-08-05 16:34:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 164601856. Throughput: 0: 6045.4. Samples: 41156780. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:34:03,119][15372] Avg episode reward: [(0, '38.049')] [2024-08-05 16:34:05,410][15444] Updated weights for policy 0, policy_version 20101 (0.0013) [2024-08-05 16:34:08,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 164724736. Throughput: 0: 6041.1. Samples: 41175130. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:34:08,126][15372] Avg episode reward: [(0, '37.550')] [2024-08-05 16:34:09,207][15444] Updated weights for policy 0, policy_version 20111 (0.0015) [2024-08-05 16:34:12,622][15444] Updated weights for policy 0, policy_version 20121 (0.0013) [2024-08-05 16:34:13,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 164839424. Throughput: 0: 6014.7. Samples: 41210660. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:34:13,119][15372] Avg episode reward: [(0, '37.206')] [2024-08-05 16:34:15,857][15444] Updated weights for policy 0, policy_version 20131 (0.0012) [2024-08-05 16:34:18,119][15372] Fps is (10 sec: 23755.1, 60 sec: 24166.2, 300 sec: 24159.4). Total num frames: 164962304. Throughput: 0: 6020.6. Samples: 41246900. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:34:18,127][15372] Avg episode reward: [(0, '37.026')] [2024-08-05 16:34:18,185][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000020138_164970496.pth... 
[2024-08-05 16:34:18,311][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000019430_159170560.pth [2024-08-05 16:34:19,459][15444] Updated weights for policy 0, policy_version 20141 (0.0026) [2024-08-05 16:34:22,837][15444] Updated weights for policy 0, policy_version 20151 (0.0034) [2024-08-05 16:34:23,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 165076992. Throughput: 0: 6017.4. Samples: 41264920. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:34:23,119][15372] Avg episode reward: [(0, '38.100')] [2024-08-05 16:34:26,038][15444] Updated weights for policy 0, policy_version 20161 (0.0012) [2024-08-05 16:34:28,118][15372] Fps is (10 sec: 24577.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 165208064. Throughput: 0: 6014.5. Samples: 41300980. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:34:28,119][15372] Avg episode reward: [(0, '38.132')] [2024-08-05 16:34:29,609][15444] Updated weights for policy 0, policy_version 20171 (0.0027) [2024-08-05 16:34:32,961][15444] Updated weights for policy 0, policy_version 20181 (0.0012) [2024-08-05 16:34:33,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 165322752. Throughput: 0: 5996.3. Samples: 41337110. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:34:33,119][15372] Avg episode reward: [(0, '37.913')] [2024-08-05 16:34:36,204][15444] Updated weights for policy 0, policy_version 20191 (0.0019) [2024-08-05 16:34:38,119][15372] Fps is (10 sec: 23755.4, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 165445632. Throughput: 0: 6010.4. Samples: 41355700. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:34:38,127][15372] Avg episode reward: [(0, '38.615')] [2024-08-05 16:34:39,584][15444] Updated weights for policy 0, policy_version 20201 (0.0011) [2024-08-05 16:34:42,404][15417] Signal inference workers to stop experience collection... (7350 times) [2024-08-05 16:34:42,406][15417] Signal inference workers to resume experience collection... (7350 times) [2024-08-05 16:34:42,473][15444] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-08-05 16:34:42,473][15444] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-08-05 16:34:42,787][15444] Updated weights for policy 0, policy_version 20211 (0.0012) [2024-08-05 16:34:43,119][15372] Fps is (10 sec: 25395.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 165576704. Throughput: 0: 6044.2. Samples: 41393050. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:34:43,119][15372] Avg episode reward: [(0, '38.693')] [2024-08-05 16:34:46,410][15444] Updated weights for policy 0, policy_version 20221 (0.0014) [2024-08-05 16:34:48,119][15372] Fps is (10 sec: 23757.8, 60 sec: 24030.0, 300 sec: 24159.4). Total num frames: 165683200. Throughput: 0: 6042.0. Samples: 41428670. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:34:48,119][15372] Avg episode reward: [(0, '38.459')] [2024-08-05 16:34:49,580][15444] Updated weights for policy 0, policy_version 20231 (0.0015) [2024-08-05 16:34:53,015][15444] Updated weights for policy 0, policy_version 20241 (0.0012) [2024-08-05 16:34:53,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 165814272. Throughput: 0: 6057.3. Samples: 41447710. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:34:53,119][15372] Avg episode reward: [(0, '38.492')] [2024-08-05 16:34:56,482][15444] Updated weights for policy 0, policy_version 20251 (0.0014) [2024-08-05 16:34:58,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 165928960. Throughput: 0: 6063.1. Samples: 41483500. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:34:58,126][15372] Avg episode reward: [(0, '37.708')] [2024-08-05 16:34:59,559][15444] Updated weights for policy 0, policy_version 20261 (0.0014) [2024-08-05 16:35:03,118][15372] Fps is (10 sec: 23758.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 166051840. Throughput: 0: 6065.0. Samples: 41519820. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:35:03,126][15372] Avg episode reward: [(0, '38.252')] [2024-08-05 16:35:03,297][15444] Updated weights for policy 0, policy_version 20271 (0.0013) [2024-08-05 16:35:06,451][15444] Updated weights for policy 0, policy_version 20281 (0.0012) [2024-08-05 16:35:08,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24187.4). Total num frames: 166174720. Throughput: 0: 6071.5. Samples: 41538140. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 16:35:08,127][15372] Avg episode reward: [(0, '38.201')] [2024-08-05 16:35:09,890][15444] Updated weights for policy 0, policy_version 20291 (0.0016) [2024-08-05 16:35:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 166297600. Throughput: 0: 6084.2. Samples: 41574770. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 16:35:13,126][15372] Avg episode reward: [(0, '38.576')] [2024-08-05 16:35:13,418][15444] Updated weights for policy 0, policy_version 20301 (0.0027) [2024-08-05 16:35:16,646][15444] Updated weights for policy 0, policy_version 20311 (0.0012) [2024-08-05 16:35:18,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.6, 300 sec: 24159.7). 
Total num frames: 166412288. Throughput: 0: 6080.9. Samples: 41610750. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 16:35:18,127][15372] Avg episode reward: [(0, '37.975')] [2024-08-05 16:35:20,158][15444] Updated weights for policy 0, policy_version 20321 (0.0017) [2024-08-05 16:35:23,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24439.4, 300 sec: 24215.0). Total num frames: 166543360. Throughput: 0: 6069.4. Samples: 41628820. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:35:23,126][15372] Avg episode reward: [(0, '38.216')] [2024-08-05 16:35:23,463][15444] Updated weights for policy 0, policy_version 20331 (0.0019) [2024-08-05 16:35:26,429][15417] Signal inference workers to stop experience collection... (7400 times) [2024-08-05 16:35:26,430][15417] Signal inference workers to resume experience collection... (7400 times) [2024-08-05 16:35:26,482][15444] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-08-05 16:35:26,488][15444] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-08-05 16:35:26,725][15444] Updated weights for policy 0, policy_version 20341 (0.0012) [2024-08-05 16:35:28,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 166658048. Throughput: 0: 6051.8. Samples: 41665380. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:35:28,119][15372] Avg episode reward: [(0, '37.964')] [2024-08-05 16:35:30,298][15444] Updated weights for policy 0, policy_version 20351 (0.0012) [2024-08-05 16:35:33,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.4, 300 sec: 24215.0). Total num frames: 166789120. Throughput: 0: 6082.0. Samples: 41702360. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:35:33,127][15372] Avg episode reward: [(0, '38.422')] [2024-08-05 16:35:33,648][15444] Updated weights for policy 0, policy_version 20361 (0.0013) [2024-08-05 16:35:36,904][15444] Updated weights for policy 0, policy_version 20371 (0.0016) [2024-08-05 16:35:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.1, 300 sec: 24215.0). Total num frames: 166903808. Throughput: 0: 6061.2. Samples: 41720460. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:35:38,119][15372] Avg episode reward: [(0, '37.844')] [2024-08-05 16:35:40,381][15444] Updated weights for policy 0, policy_version 20381 (0.0022) [2024-08-05 16:35:43,118][15372] Fps is (10 sec: 22937.9, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 167018496. Throughput: 0: 6074.0. Samples: 41756830. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:35:43,119][15372] Avg episode reward: [(0, '37.919')] [2024-08-05 16:35:43,664][15444] Updated weights for policy 0, policy_version 20391 (0.0023) [2024-08-05 16:35:47,218][15444] Updated weights for policy 0, policy_version 20401 (0.0011) [2024-08-05 16:35:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 167149568. Throughput: 0: 6065.8. Samples: 41792780. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:35:48,119][15372] Avg episode reward: [(0, '38.318')] [2024-08-05 16:35:50,554][15444] Updated weights for policy 0, policy_version 20411 (0.0032) [2024-08-05 16:35:53,119][15372] Fps is (10 sec: 25393.8, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 167272448. Throughput: 0: 6066.4. Samples: 41811130. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:35:53,119][15372] Avg episode reward: [(0, '38.708')] [2024-08-05 16:35:53,810][15444] Updated weights for policy 0, policy_version 20421 (0.0016) [2024-08-05 16:35:57,238][15444] Updated weights for policy 0, policy_version 20431 (0.0024) [2024-08-05 16:35:58,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 167387136. Throughput: 0: 6056.2. Samples: 41847300. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 16:35:58,119][15372] Avg episode reward: [(0, '38.527')] [2024-08-05 16:36:00,363][15444] Updated weights for policy 0, policy_version 20441 (0.0031) [2024-08-05 16:36:03,118][15372] Fps is (10 sec: 23758.2, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 167510016. Throughput: 0: 6079.1. Samples: 41884310. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 16:36:03,119][15372] Avg episode reward: [(0, '38.170')] [2024-08-05 16:36:03,915][15444] Updated weights for policy 0, policy_version 20451 (0.0024) [2024-08-05 16:36:07,279][15444] Updated weights for policy 0, policy_version 20461 (0.0020) [2024-08-05 16:36:08,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 167632896. Throughput: 0: 6083.8. Samples: 41902590. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 16:36:08,119][15372] Avg episode reward: [(0, '37.688')] [2024-08-05 16:36:10,681][15444] Updated weights for policy 0, policy_version 20471 (0.0011) [2024-08-05 16:36:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 167755776. Throughput: 0: 6073.1. Samples: 41938670. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 16:36:13,126][15372] Avg episode reward: [(0, '38.121')] [2024-08-05 16:36:14,313][15444] Updated weights for policy 0, policy_version 20481 (0.0026) [2024-08-05 16:36:16,414][15417] Signal inference workers to stop experience collection... 
(7450 times) [2024-08-05 16:36:16,414][15417] Signal inference workers to resume experience collection... (7450 times) [2024-08-05 16:36:16,461][15444] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-08-05 16:36:16,470][15444] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-08-05 16:36:17,423][15444] Updated weights for policy 0, policy_version 20491 (0.0027) [2024-08-05 16:36:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 167870464. Throughput: 0: 6054.0. Samples: 41974790. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:36:18,126][15372] Avg episode reward: [(0, '37.943')] [2024-08-05 16:36:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000020492_167870464.pth... [2024-08-05 16:36:18,298][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000019783_162062336.pth [2024-08-05 16:36:20,768][15444] Updated weights for policy 0, policy_version 20501 (0.0016) [2024-08-05 16:36:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 167993344. Throughput: 0: 6072.2. Samples: 41993710. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:36:23,119][15372] Avg episode reward: [(0, '39.197')] [2024-08-05 16:36:23,967][15444] Updated weights for policy 0, policy_version 20511 (0.0012) [2024-08-05 16:36:27,708][15444] Updated weights for policy 0, policy_version 20521 (0.0026) [2024-08-05 16:36:28,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 168124416. Throughput: 0: 6080.2. Samples: 42030440. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:36:28,119][15372] Avg episode reward: [(0, '39.406')] [2024-08-05 16:36:30,996][15444] Updated weights for policy 0, policy_version 20531 (0.0019) [2024-08-05 16:36:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 168239104. Throughput: 0: 6068.0. Samples: 42065840. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:36:33,119][15372] Avg episode reward: [(0, '38.165')] [2024-08-05 16:36:34,485][15444] Updated weights for policy 0, policy_version 20541 (0.0013) [2024-08-05 16:36:37,857][15444] Updated weights for policy 0, policy_version 20551 (0.0011) [2024-08-05 16:36:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 168361984. Throughput: 0: 6059.6. Samples: 42083810. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:36:38,119][15372] Avg episode reward: [(0, '38.314')] [2024-08-05 16:36:41,011][15444] Updated weights for policy 0, policy_version 20561 (0.0013) [2024-08-05 16:36:43,119][15372] Fps is (10 sec: 23754.9, 60 sec: 24302.6, 300 sec: 24214.9). Total num frames: 168476672. Throughput: 0: 6057.3. Samples: 42119880. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:36:43,120][15372] Avg episode reward: [(0, '38.541')] [2024-08-05 16:36:44,724][15444] Updated weights for policy 0, policy_version 20571 (0.0011) [2024-08-05 16:36:47,874][15444] Updated weights for policy 0, policy_version 20581 (0.0017) [2024-08-05 16:36:48,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 168599552. Throughput: 0: 6034.0. Samples: 42155840. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:36:48,119][15372] Avg episode reward: [(0, '38.203')] [2024-08-05 16:36:51,289][15444] Updated weights for policy 0, policy_version 20591 (0.0011) [2024-08-05 16:36:53,118][15372] Fps is (10 sec: 24578.0, 60 sec: 24166.6, 300 sec: 24242.8). 
Total num frames: 168722432. Throughput: 0: 6051.3. Samples: 42174900. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:36:53,126][15372] Avg episode reward: [(0, '37.987')] [2024-08-05 16:36:54,133][15417] Signal inference workers to stop experience collection... (7500 times) [2024-08-05 16:36:54,134][15417] Signal inference workers to resume experience collection... (7500 times) [2024-08-05 16:36:54,181][15444] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-08-05 16:36:54,182][15444] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-08-05 16:36:54,477][15444] Updated weights for policy 0, policy_version 20601 (0.0011) [2024-08-05 16:36:58,114][15444] Updated weights for policy 0, policy_version 20611 (0.0012) [2024-08-05 16:36:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 168845312. Throughput: 0: 6057.1. Samples: 42211240. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:36:58,122][15372] Avg episode reward: [(0, '37.821')] [2024-08-05 16:37:01,834][15444] Updated weights for policy 0, policy_version 20621 (0.0026) [2024-08-05 16:37:03,119][15372] Fps is (10 sec: 24574.6, 60 sec: 24302.7, 300 sec: 24242.8). Total num frames: 168968192. Throughput: 0: 6058.2. Samples: 42247410. Policy #0 lag: (min: 0.0, avg: 4.4, max: 7.0) [2024-08-05 16:37:03,119][15372] Avg episode reward: [(0, '37.625')] [2024-08-05 16:37:04,761][15444] Updated weights for policy 0, policy_version 20631 (0.0020) [2024-08-05 16:37:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 169082880. Throughput: 0: 6050.9. Samples: 42266000. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 7.0) [2024-08-05 16:37:08,126][15372] Avg episode reward: [(0, '37.000')] [2024-08-05 16:37:08,327][15444] Updated weights for policy 0, policy_version 20641 (0.0030) [2024-08-05 16:37:11,323][15444] Updated weights for policy 0, policy_version 20651 (0.0013) [2024-08-05 16:37:13,119][15372] Fps is (10 sec: 23757.7, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 169205760. Throughput: 0: 6021.1. Samples: 42301390. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 16:37:13,126][15372] Avg episode reward: [(0, '38.463')] [2024-08-05 16:37:15,025][15444] Updated weights for policy 0, policy_version 20661 (0.0013) [2024-08-05 16:37:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 169328640. Throughput: 0: 6050.7. Samples: 42338120. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 16:37:18,126][15372] Avg episode reward: [(0, '38.664')] [2024-08-05 16:37:18,469][15444] Updated weights for policy 0, policy_version 20671 (0.0011) [2024-08-05 16:37:21,711][15444] Updated weights for policy 0, policy_version 20681 (0.0016) [2024-08-05 16:37:23,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 169443328. Throughput: 0: 6067.8. Samples: 42356860. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 16:37:23,119][15372] Avg episode reward: [(0, '37.649')] [2024-08-05 16:37:25,303][15444] Updated weights for policy 0, policy_version 20691 (0.0010) [2024-08-05 16:37:27,272][15417] Signal inference workers to stop experience collection... (7550 times) [2024-08-05 16:37:27,280][15417] Signal inference workers to resume experience collection... 
(7550 times) [2024-08-05 16:37:27,317][15444] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-08-05 16:37:27,317][15444] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-08-05 16:37:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 169574400. Throughput: 0: 6063.4. Samples: 42392730. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 16:37:28,119][15372] Avg episode reward: [(0, '38.053')] [2024-08-05 16:37:28,315][15444] Updated weights for policy 0, policy_version 20701 (0.0021) [2024-08-05 16:37:31,918][15444] Updated weights for policy 0, policy_version 20711 (0.0015) [2024-08-05 16:37:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24243.0). Total num frames: 169689088. Throughput: 0: 6077.1. Samples: 42429310. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 16:37:33,119][15372] Avg episode reward: [(0, '38.556')] [2024-08-05 16:37:35,345][15444] Updated weights for policy 0, policy_version 20721 (0.0018) [2024-08-05 16:37:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 169811968. Throughput: 0: 6072.2. Samples: 42448150. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 16:37:38,119][15372] Avg episode reward: [(0, '38.766')] [2024-08-05 16:37:38,414][15444] Updated weights for policy 0, policy_version 20731 (0.0012) [2024-08-05 16:37:42,220][15444] Updated weights for policy 0, policy_version 20741 (0.0029) [2024-08-05 16:37:43,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.2, 300 sec: 24242.8). Total num frames: 169934848. Throughput: 0: 6051.8. Samples: 42483570. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:37:43,119][15372] Avg episode reward: [(0, '37.834')] [2024-08-05 16:37:45,349][15444] Updated weights for policy 0, policy_version 20751 (0.0010) [2024-08-05 16:37:48,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.3, 300 sec: 24242.7). 
Total num frames: 170049536. Throughput: 0: 6050.5. Samples: 42519680. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:37:48,119][15372] Avg episode reward: [(0, '37.731')] [2024-08-05 16:37:48,885][15444] Updated weights for policy 0, policy_version 20761 (0.0018) [2024-08-05 16:37:52,629][15444] Updated weights for policy 0, policy_version 20771 (0.0030) [2024-08-05 16:37:53,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 170172416. Throughput: 0: 6041.6. Samples: 42537870. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:37:53,119][15372] Avg episode reward: [(0, '37.740')] [2024-08-05 16:37:55,517][15444] Updated weights for policy 0, policy_version 20781 (0.0029) [2024-08-05 16:37:58,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 170295296. Throughput: 0: 6040.9. Samples: 42573230. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 16:37:58,126][15372] Avg episode reward: [(0, '37.906')] [2024-08-05 16:37:59,244][15444] Updated weights for policy 0, policy_version 20791 (0.0014) [2024-08-05 16:38:01,941][15417] Signal inference workers to stop experience collection... (7600 times) [2024-08-05 16:38:01,942][15417] Signal inference workers to resume experience collection... (7600 times) [2024-08-05 16:38:02,022][15444] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-08-05 16:38:02,023][15444] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-08-05 16:38:02,547][15444] Updated weights for policy 0, policy_version 20801 (0.0016) [2024-08-05 16:38:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.1, 300 sec: 24215.0). Total num frames: 170409984. Throughput: 0: 6019.3. Samples: 42608990. 
Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 16:38:03,119][15372] Avg episode reward: [(0, '38.201')] [2024-08-05 16:38:05,915][15444] Updated weights for policy 0, policy_version 20811 (0.0021) [2024-08-05 16:38:08,119][15372] Fps is (10 sec: 22936.5, 60 sec: 24029.7, 300 sec: 24187.2). Total num frames: 170524672. Throughput: 0: 6013.9. Samples: 42627490. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 16:38:08,127][15372] Avg episode reward: [(0, '37.320')] [2024-08-05 16:38:09,476][15444] Updated weights for policy 0, policy_version 20821 (0.0014) [2024-08-05 16:38:12,725][15444] Updated weights for policy 0, policy_version 20831 (0.0015) [2024-08-05 16:38:13,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 170647552. Throughput: 0: 6016.4. Samples: 42663470. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:38:13,119][15372] Avg episode reward: [(0, '38.506')] [2024-08-05 16:38:16,480][15444] Updated weights for policy 0, policy_version 20841 (0.0019) [2024-08-05 16:38:18,119][15372] Fps is (10 sec: 24576.9, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 170770432. Throughput: 0: 5996.4. Samples: 42699150. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:38:18,119][15372] Avg episode reward: [(0, '38.686')] [2024-08-05 16:38:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000020846_170770432.pth... [2024-08-05 16:38:18,269][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000020138_164970496.pth [2024-08-05 16:38:19,717][15444] Updated weights for policy 0, policy_version 20851 (0.0035) [2024-08-05 16:38:23,083][15444] Updated weights for policy 0, policy_version 20861 (0.0020) [2024-08-05 16:38:23,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 170893312. 
Throughput: 0: 5978.6. Samples: 42717190. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:38:23,119][15372] Avg episode reward: [(0, '37.530')] [2024-08-05 16:38:26,397][15444] Updated weights for policy 0, policy_version 20871 (0.0025) [2024-08-05 16:38:28,119][15372] Fps is (10 sec: 23756.0, 60 sec: 23893.1, 300 sec: 24159.4). Total num frames: 171008000. Throughput: 0: 5986.8. Samples: 42752980. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:38:28,127][15372] Avg episode reward: [(0, '37.758')] [2024-08-05 16:38:29,767][15444] Updated weights for policy 0, policy_version 20881 (0.0011) [2024-08-05 16:38:33,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 171130880. Throughput: 0: 5995.3. Samples: 42789470. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:38:33,127][15372] Avg episode reward: [(0, '37.642')] [2024-08-05 16:38:33,287][15444] Updated weights for policy 0, policy_version 20891 (0.0026) [2024-08-05 16:38:36,431][15444] Updated weights for policy 0, policy_version 20901 (0.0011) [2024-08-05 16:38:38,118][15372] Fps is (10 sec: 24577.3, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 171253760. Throughput: 0: 6012.0. Samples: 42808410. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:38:38,126][15372] Avg episode reward: [(0, '36.958')] [2024-08-05 16:38:39,898][15444] Updated weights for policy 0, policy_version 20911 (0.0013) [2024-08-05 16:38:43,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24029.9, 300 sec: 24187.3). Total num frames: 171376640. Throughput: 0: 6041.1. Samples: 42845080. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:38:43,126][15372] Avg episode reward: [(0, '36.681')] [2024-08-05 16:38:43,323][15444] Updated weights for policy 0, policy_version 20921 (0.0020) [2024-08-05 16:38:46,525][15444] Updated weights for policy 0, policy_version 20931 (0.0017) [2024-08-05 16:38:48,119][15372] Fps is (10 sec: 24574.0, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 171499520. Throughput: 0: 6035.2. Samples: 42880580. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:38:48,127][15372] Avg episode reward: [(0, '38.112')] [2024-08-05 16:38:50,034][15444] Updated weights for policy 0, policy_version 20941 (0.0012) [2024-08-05 16:38:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 171622400. Throughput: 0: 6033.4. Samples: 42898990. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:38:53,126][15372] Avg episode reward: [(0, '38.455')] [2024-08-05 16:38:53,459][15444] Updated weights for policy 0, policy_version 20951 (0.0012) [2024-08-05 16:38:56,871][15444] Updated weights for policy 0, policy_version 20961 (0.0019) [2024-08-05 16:38:57,914][15417] Signal inference workers to stop experience collection... (7650 times) [2024-08-05 16:38:57,915][15417] Signal inference workers to resume experience collection... (7650 times) [2024-08-05 16:38:57,979][15444] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-08-05 16:38:57,979][15444] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-08-05 16:38:58,119][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.2, 300 sec: 24214.9). Total num frames: 171745280. Throughput: 0: 6043.3. Samples: 42935420. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:38:58,119][15372] Avg episode reward: [(0, '38.113')] [2024-08-05 16:39:00,063][15444] Updated weights for policy 0, policy_version 20971 (0.0020) [2024-08-05 16:39:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 171859968. Throughput: 0: 6067.8. Samples: 42972200. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:39:03,126][15372] Avg episode reward: [(0, '37.605')] [2024-08-05 16:39:03,447][15444] Updated weights for policy 0, policy_version 20981 (0.0012) [2024-08-05 16:39:07,054][15444] Updated weights for policy 0, policy_version 20991 (0.0010) [2024-08-05 16:39:08,119][15372] Fps is (10 sec: 24576.8, 60 sec: 24439.5, 300 sec: 24242.7). Total num frames: 171991040. Throughput: 0: 6073.8. Samples: 42990510. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 16:39:08,119][15372] Avg episode reward: [(0, '37.981')] [2024-08-05 16:39:10,141][15444] Updated weights for policy 0, policy_version 21001 (0.0011) [2024-08-05 16:39:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.1, 300 sec: 24215.1). Total num frames: 172105728. Throughput: 0: 6098.1. Samples: 43027390. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 16:39:13,126][15372] Avg episode reward: [(0, '38.374')] [2024-08-05 16:39:13,624][15444] Updated weights for policy 0, policy_version 21011 (0.0011) [2024-08-05 16:39:16,922][15444] Updated weights for policy 0, policy_version 21021 (0.0019) [2024-08-05 16:39:18,119][15372] Fps is (10 sec: 23757.1, 60 sec: 24302.9, 300 sec: 24242.7). Total num frames: 172228608. Throughput: 0: 6092.0. Samples: 43063610. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 16:39:18,119][15372] Avg episode reward: [(0, '38.080')] [2024-08-05 16:39:20,246][15444] Updated weights for policy 0, policy_version 21031 (0.0027) [2024-08-05 16:39:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24215.0). 
Total num frames: 172351488. Throughput: 0: 6078.9. Samples: 43081960. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:39:23,126][15372] Avg episode reward: [(0, '37.452')] [2024-08-05 16:39:23,818][15444] Updated weights for policy 0, policy_version 21041 (0.0012) [2024-08-05 16:39:27,280][15444] Updated weights for policy 0, policy_version 21051 (0.0021) [2024-08-05 16:39:28,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 172466176. Throughput: 0: 6054.4. Samples: 43117530. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:39:28,120][15372] Avg episode reward: [(0, '38.615')] [2024-08-05 16:39:30,465][15444] Updated weights for policy 0, policy_version 21061 (0.0018) [2024-08-05 16:39:33,128][15372] Fps is (10 sec: 23734.3, 60 sec: 24299.3, 300 sec: 24214.3). Total num frames: 172589056. Throughput: 0: 6072.6. Samples: 43153900. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:39:33,136][15372] Avg episode reward: [(0, '37.777')] [2024-08-05 16:39:34,130][15444] Updated weights for policy 0, policy_version 21071 (0.0017) [2024-08-05 16:39:37,500][15444] Updated weights for policy 0, policy_version 21081 (0.0030) [2024-08-05 16:39:38,118][15372] Fps is (10 sec: 23757.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 172703744. Throughput: 0: 6069.1. Samples: 43172100. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 16:39:38,119][15372] Avg episode reward: [(0, '37.875')] [2024-08-05 16:39:40,719][15444] Updated weights for policy 0, policy_version 21091 (0.0013) [2024-08-05 16:39:43,118][15372] Fps is (10 sec: 24599.2, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 172834816. Throughput: 0: 6051.6. Samples: 43207740. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 16:39:43,126][15372] Avg episode reward: [(0, '38.032')] [2024-08-05 16:39:44,523][15444] Updated weights for policy 0, policy_version 21101 (0.0025) [2024-08-05 16:39:47,983][15444] Updated weights for policy 0, policy_version 21111 (0.0031) [2024-08-05 16:39:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.2, 300 sec: 24159.5). Total num frames: 172941312. Throughput: 0: 6021.8. Samples: 43243180. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 16:39:48,119][15372] Avg episode reward: [(0, '37.988')] [2024-08-05 16:39:50,970][15417] Signal inference workers to stop experience collection... (7700 times) [2024-08-05 16:39:50,980][15417] Signal inference workers to resume experience collection... (7700 times) [2024-08-05 16:39:51,021][15444] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-08-05 16:39:51,027][15444] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-08-05 16:39:51,098][15444] Updated weights for policy 0, policy_version 21121 (0.0037) [2024-08-05 16:39:53,119][15372] Fps is (10 sec: 22937.4, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 173064192. Throughput: 0: 6025.6. Samples: 43261660. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:39:53,126][15372] Avg episode reward: [(0, '38.831')] [2024-08-05 16:39:54,664][15444] Updated weights for policy 0, policy_version 21131 (0.0034) [2024-08-05 16:39:57,771][15444] Updated weights for policy 0, policy_version 21141 (0.0022) [2024-08-05 16:39:58,119][15372] Fps is (10 sec: 25395.0, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 173195264. Throughput: 0: 6022.2. Samples: 43298390. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:39:58,119][15372] Avg episode reward: [(0, '38.203')] [2024-08-05 16:40:01,247][15444] Updated weights for policy 0, policy_version 21151 (0.0019) [2024-08-05 16:40:03,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 173309952. Throughput: 0: 6006.0. Samples: 43333880. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:40:03,126][15372] Avg episode reward: [(0, '38.495')] [2024-08-05 16:40:04,467][15444] Updated weights for policy 0, policy_version 21161 (0.0010) [2024-08-05 16:40:07,876][15444] Updated weights for policy 0, policy_version 21171 (0.0012) [2024-08-05 16:40:08,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24030.0, 300 sec: 24187.2). Total num frames: 173432832. Throughput: 0: 6011.6. Samples: 43352480. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 16:40:08,119][15372] Avg episode reward: [(0, '38.174')] [2024-08-05 16:40:11,439][15444] Updated weights for policy 0, policy_version 21181 (0.0035) [2024-08-05 16:40:13,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 173555712. Throughput: 0: 6021.4. Samples: 43388490. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 16:40:13,119][15372] Avg episode reward: [(0, '38.077')] [2024-08-05 16:40:14,609][15444] Updated weights for policy 0, policy_version 21191 (0.0011) [2024-08-05 16:40:18,007][15444] Updated weights for policy 0, policy_version 21201 (0.0016) [2024-08-05 16:40:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 173678592. Throughput: 0: 6039.9. Samples: 43425640. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 16:40:18,119][15372] Avg episode reward: [(0, '38.365')] [2024-08-05 16:40:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000021201_173678592.pth... 
[2024-08-05 16:40:18,245][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000020492_167870464.pth [2024-08-05 16:40:21,320][15444] Updated weights for policy 0, policy_version 21211 (0.0024) [2024-08-05 16:40:23,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 173793280. Throughput: 0: 6032.6. Samples: 43443570. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:40:23,126][15372] Avg episode reward: [(0, '38.168')] [2024-08-05 16:40:24,880][15444] Updated weights for policy 0, policy_version 21221 (0.0014) [2024-08-05 16:40:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.6, 300 sec: 24159.5). Total num frames: 173916160. Throughput: 0: 6056.0. Samples: 43480260. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:40:28,126][15372] Avg episode reward: [(0, '37.712')] [2024-08-05 16:40:28,700][15444] Updated weights for policy 0, policy_version 21231 (0.0035) [2024-08-05 16:40:29,033][15417] Signal inference workers to stop experience collection... (7750 times) [2024-08-05 16:40:29,034][15417] Signal inference workers to resume experience collection... (7750 times) [2024-08-05 16:40:29,077][15444] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-08-05 16:40:29,077][15444] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-08-05 16:40:31,601][15444] Updated weights for policy 0, policy_version 21241 (0.0017) [2024-08-05 16:40:33,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24033.6, 300 sec: 24159.5). Total num frames: 174030848. Throughput: 0: 6068.2. Samples: 43516250. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:40:33,119][15372] Avg episode reward: [(0, '37.575')] [2024-08-05 16:40:35,196][15444] Updated weights for policy 0, policy_version 21251 (0.0016) [2024-08-05 16:40:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24215.0). 
Total num frames: 174161920. Throughput: 0: 6046.4. Samples: 43533750. Policy #0 lag: (min: 0.0, avg: 3.1, max: 9.0) [2024-08-05 16:40:38,126][15372] Avg episode reward: [(0, '37.814')] [2024-08-05 16:40:38,394][15444] Updated weights for policy 0, policy_version 21261 (0.0037) [2024-08-05 16:40:41,929][15444] Updated weights for policy 0, policy_version 21271 (0.0020) [2024-08-05 16:40:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 174276608. Throughput: 0: 6035.3. Samples: 43569980. Policy #0 lag: (min: 0.0, avg: 3.1, max: 9.0) [2024-08-05 16:40:43,119][15372] Avg episode reward: [(0, '38.050')] [2024-08-05 16:40:45,354][15444] Updated weights for policy 0, policy_version 21281 (0.0016) [2024-08-05 16:40:48,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 174399488. Throughput: 0: 6066.0. Samples: 43606850. Policy #0 lag: (min: 0.0, avg: 3.1, max: 9.0) [2024-08-05 16:40:48,119][15372] Avg episode reward: [(0, '39.110')] [2024-08-05 16:40:48,522][15444] Updated weights for policy 0, policy_version 21291 (0.0032) [2024-08-05 16:40:52,179][15444] Updated weights for policy 0, policy_version 21301 (0.0019) [2024-08-05 16:40:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 174522368. Throughput: 0: 6040.9. Samples: 43624320. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 16:40:53,119][15372] Avg episode reward: [(0, '38.957')] [2024-08-05 16:40:55,491][15444] Updated weights for policy 0, policy_version 21311 (0.0031) [2024-08-05 16:40:58,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 174645248. Throughput: 0: 6060.4. Samples: 43661210. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 16:40:58,119][15372] Avg episode reward: [(0, '38.259')] [2024-08-05 16:40:58,567][15417] Signal inference workers to stop experience collection... 
(7800 times) [2024-08-05 16:40:58,575][15417] Signal inference workers to resume experience collection... (7800 times) [2024-08-05 16:40:58,642][15444] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-08-05 16:40:58,642][15444] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-08-05 16:40:58,665][15444] Updated weights for policy 0, policy_version 21321 (0.0038) [2024-08-05 16:41:02,557][15444] Updated weights for policy 0, policy_version 21331 (0.0025) [2024-08-05 16:41:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 174759936. Throughput: 0: 6034.7. Samples: 43697200. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 16:41:03,119][15372] Avg episode reward: [(0, '37.992')] [2024-08-05 16:41:05,369][15444] Updated weights for policy 0, policy_version 21341 (0.0013) [2024-08-05 16:41:08,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 174882816. Throughput: 0: 6038.9. Samples: 43715320. Policy #0 lag: (min: 0.0, avg: 4.5, max: 7.0) [2024-08-05 16:41:08,119][15372] Avg episode reward: [(0, '38.606')] [2024-08-05 16:41:09,053][15444] Updated weights for policy 0, policy_version 21351 (0.0017) [2024-08-05 16:41:12,515][15444] Updated weights for policy 0, policy_version 21361 (0.0015) [2024-08-05 16:41:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 174997504. Throughput: 0: 6018.0. Samples: 43751070. Policy #0 lag: (min: 0.0, avg: 4.5, max: 7.0) [2024-08-05 16:41:13,119][15372] Avg episode reward: [(0, '38.297')] [2024-08-05 16:41:15,655][15444] Updated weights for policy 0, policy_version 21371 (0.0015) [2024-08-05 16:41:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 175128576. Throughput: 0: 6026.2. Samples: 43787430. 
Policy #0 lag: (min: 0.0, avg: 4.5, max: 7.0) [2024-08-05 16:41:18,126][15372] Avg episode reward: [(0, '37.069')] [2024-08-05 16:41:19,199][15444] Updated weights for policy 0, policy_version 21381 (0.0017) [2024-08-05 16:41:22,481][15444] Updated weights for policy 0, policy_version 21391 (0.0021) [2024-08-05 16:41:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 175243264. Throughput: 0: 6045.4. Samples: 43805790. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 16:41:23,119][15372] Avg episode reward: [(0, '36.818')] [2024-08-05 16:41:25,964][15444] Updated weights for policy 0, policy_version 21401 (0.0024) [2024-08-05 16:41:28,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 175366144. Throughput: 0: 6036.9. Samples: 43841640. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 16:41:28,126][15372] Avg episode reward: [(0, '37.836')] [2024-08-05 16:41:29,349][15444] Updated weights for policy 0, policy_version 21411 (0.0021) [2024-08-05 16:41:32,748][15444] Updated weights for policy 0, policy_version 21421 (0.0013) [2024-08-05 16:41:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 175480832. Throughput: 0: 6018.9. Samples: 43877700. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 16:41:33,119][15372] Avg episode reward: [(0, '38.072')] [2024-08-05 16:41:36,099][15444] Updated weights for policy 0, policy_version 21431 (0.0021) [2024-08-05 16:41:38,127][15372] Fps is (10 sec: 24554.2, 60 sec: 24162.9, 300 sec: 24186.6). Total num frames: 175611904. Throughput: 0: 6035.5. Samples: 43895970. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:41:38,136][15372] Avg episode reward: [(0, '37.793')] [2024-08-05 16:41:39,600][15444] Updated weights for policy 0, policy_version 21441 (0.0035) [2024-08-05 16:41:42,923][15444] Updated weights for policy 0, policy_version 21451 (0.0016) [2024-08-05 16:41:43,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 175726592. Throughput: 0: 6019.6. Samples: 43932090. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:41:43,119][15372] Avg episode reward: [(0, '37.463')] [2024-08-05 16:41:46,473][15444] Updated weights for policy 0, policy_version 21461 (0.0025) [2024-08-05 16:41:48,118][15372] Fps is (10 sec: 23777.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 175849472. Throughput: 0: 6013.1. Samples: 43967790. Policy #0 lag: (min: 1.0, avg: 4.2, max: 9.0) [2024-08-05 16:41:48,119][15372] Avg episode reward: [(0, '38.053')] [2024-08-05 16:41:49,071][15417] Signal inference workers to stop experience collection... (7850 times) [2024-08-05 16:41:49,079][15417] Signal inference workers to resume experience collection... (7850 times) [2024-08-05 16:41:49,108][15444] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-08-05 16:41:49,108][15444] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-08-05 16:41:49,708][15444] Updated weights for policy 0, policy_version 21471 (0.0012) [2024-08-05 16:41:53,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 175964160. Throughput: 0: 6033.3. Samples: 43986820. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 9.0) [2024-08-05 16:41:53,126][15372] Avg episode reward: [(0, '38.808')] [2024-08-05 16:41:53,252][15444] Updated weights for policy 0, policy_version 21481 (0.0018) [2024-08-05 16:41:56,231][15444] Updated weights for policy 0, policy_version 21491 (0.0028) [2024-08-05 16:41:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 176095232. Throughput: 0: 6037.3. Samples: 44022750. Policy #0 lag: (min: 1.0, avg: 4.2, max: 9.0) [2024-08-05 16:41:58,126][15372] Avg episode reward: [(0, '38.191')] [2024-08-05 16:41:59,825][15444] Updated weights for policy 0, policy_version 21501 (0.0027) [2024-08-05 16:42:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 176209920. Throughput: 0: 6037.6. Samples: 44059120. Policy #0 lag: (min: 1.0, avg: 4.2, max: 9.0) [2024-08-05 16:42:03,126][15372] Avg episode reward: [(0, '37.292')] [2024-08-05 16:42:03,214][15444] Updated weights for policy 0, policy_version 21511 (0.0016) [2024-08-05 16:42:06,408][15444] Updated weights for policy 0, policy_version 21521 (0.0021) [2024-08-05 16:42:08,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 176332800. Throughput: 0: 6054.2. Samples: 44078230. Policy #0 lag: (min: 1.0, avg: 4.2, max: 9.0) [2024-08-05 16:42:08,126][15372] Avg episode reward: [(0, '37.862')] [2024-08-05 16:42:10,117][15444] Updated weights for policy 0, policy_version 21531 (0.0033) [2024-08-05 16:42:13,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24302.8, 300 sec: 24159.4). Total num frames: 176455680. Throughput: 0: 6067.5. Samples: 44114680. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 9.0) [2024-08-05 16:42:13,127][15372] Avg episode reward: [(0, '38.514')] [2024-08-05 16:42:13,159][15444] Updated weights for policy 0, policy_version 21541 (0.0035) [2024-08-05 16:42:16,627][15444] Updated weights for policy 0, policy_version 21551 (0.0015) [2024-08-05 16:42:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 176578560. Throughput: 0: 6062.2. Samples: 44150500. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 16:42:18,126][15372] Avg episode reward: [(0, '39.210')] [2024-08-05 16:42:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000021555_176578560.pth... [2024-08-05 16:42:18,255][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000020846_170770432.pth [2024-08-05 16:42:20,183][15444] Updated weights for policy 0, policy_version 21561 (0.0035) [2024-08-05 16:42:23,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 176701440. Throughput: 0: 6069.6. Samples: 44169050. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 16:42:23,126][15372] Avg episode reward: [(0, '40.006')] [2024-08-05 16:42:23,127][15417] Saving new best policy, reward=40.006! [2024-08-05 16:42:23,371][15444] Updated weights for policy 0, policy_version 21571 (0.0011) [2024-08-05 16:42:26,782][15444] Updated weights for policy 0, policy_version 21581 (0.0020) [2024-08-05 16:42:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 176816128. Throughput: 0: 6057.6. Samples: 44204680. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 16:42:28,119][15372] Avg episode reward: [(0, '39.017')] [2024-08-05 16:42:30,351][15444] Updated weights for policy 0, policy_version 21591 (0.0026) [2024-08-05 16:42:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 176947200. Throughput: 0: 6096.0. Samples: 44242110. Policy #0 lag: (min: 0.0, avg: 4.4, max: 7.0) [2024-08-05 16:42:33,119][15372] Avg episode reward: [(0, '37.316')] [2024-08-05 16:42:33,430][15444] Updated weights for policy 0, policy_version 21601 (0.0011) [2024-08-05 16:42:36,931][15444] Updated weights for policy 0, policy_version 21611 (0.0030) [2024-08-05 16:42:37,079][15417] Signal inference workers to stop experience collection... (7900 times) [2024-08-05 16:42:37,080][15417] Signal inference workers to resume experience collection... (7900 times) [2024-08-05 16:42:37,132][15444] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-08-05 16:42:37,132][15444] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-08-05 16:42:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24170.0, 300 sec: 24159.5). Total num frames: 177061888. Throughput: 0: 6070.2. Samples: 44259980. Policy #0 lag: (min: 0.0, avg: 4.4, max: 7.0) [2024-08-05 16:42:38,119][15372] Avg episode reward: [(0, '37.082')] [2024-08-05 16:42:40,308][15444] Updated weights for policy 0, policy_version 21621 (0.0018) [2024-08-05 16:42:43,119][15372] Fps is (10 sec: 22936.2, 60 sec: 24166.1, 300 sec: 24159.4). Total num frames: 177176576. Throughput: 0: 6081.5. Samples: 44296420. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 7.0) [2024-08-05 16:42:43,119][15372] Avg episode reward: [(0, '37.531')] [2024-08-05 16:42:43,625][15444] Updated weights for policy 0, policy_version 21631 (0.0020) [2024-08-05 16:42:47,287][15444] Updated weights for policy 0, policy_version 21641 (0.0019) [2024-08-05 16:42:48,120][15372] Fps is (10 sec: 24571.7, 60 sec: 24302.2, 300 sec: 24187.1). Total num frames: 177307648. Throughput: 0: 6061.1. Samples: 44331880. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:42:48,120][15372] Avg episode reward: [(0, '37.943')] [2024-08-05 16:42:50,422][15444] Updated weights for policy 0, policy_version 21651 (0.0030) [2024-08-05 16:42:53,130][15372] Fps is (10 sec: 24548.7, 60 sec: 24298.2, 300 sec: 24158.5). Total num frames: 177422336. Throughput: 0: 6043.5. Samples: 44350260. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:42:53,140][15372] Avg episode reward: [(0, '39.051')] [2024-08-05 16:42:54,022][15444] Updated weights for policy 0, policy_version 21661 (0.0021) [2024-08-05 16:42:57,502][15444] Updated weights for policy 0, policy_version 21671 (0.0017) [2024-08-05 16:42:58,118][15372] Fps is (10 sec: 22941.6, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 177537024. Throughput: 0: 6028.7. Samples: 44385970. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:42:58,127][15372] Avg episode reward: [(0, '38.593')] [2024-08-05 16:43:00,732][15444] Updated weights for policy 0, policy_version 21681 (0.0017) [2024-08-05 16:43:03,118][15372] Fps is (10 sec: 23784.7, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 177659904. Throughput: 0: 6027.8. Samples: 44421750. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:43:03,126][15372] Avg episode reward: [(0, '38.476')] [2024-08-05 16:43:04,549][15444] Updated weights for policy 0, policy_version 21691 (0.0026) [2024-08-05 16:43:07,623][15444] Updated weights for policy 0, policy_version 21701 (0.0019) [2024-08-05 16:43:08,119][15372] Fps is (10 sec: 24573.4, 60 sec: 24166.0, 300 sec: 24187.2). Total num frames: 177782784. Throughput: 0: 6010.3. Samples: 44439520. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:43:08,120][15372] Avg episode reward: [(0, '37.799')] [2024-08-05 16:43:11,119][15444] Updated weights for policy 0, policy_version 21711 (0.0024) [2024-08-05 16:43:13,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 177897472. Throughput: 0: 6017.9. Samples: 44475490. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:43:13,127][15372] Avg episode reward: [(0, '37.716')] [2024-08-05 16:43:14,388][15444] Updated weights for policy 0, policy_version 21721 (0.0021) [2024-08-05 16:43:15,344][15417] Signal inference workers to stop experience collection... (7950 times) [2024-08-05 16:43:15,344][15417] Signal inference workers to resume experience collection... (7950 times) [2024-08-05 16:43:15,380][15444] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-08-05 16:43:15,386][15444] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-08-05 16:43:17,729][15444] Updated weights for policy 0, policy_version 21731 (0.0023) [2024-08-05 16:43:18,118][15372] Fps is (10 sec: 24578.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 178028544. Throughput: 0: 6022.2. Samples: 44513110. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 16:43:18,119][15372] Avg episode reward: [(0, '38.180')] [2024-08-05 16:43:21,098][15444] Updated weights for policy 0, policy_version 21741 (0.0022) [2024-08-05 16:43:23,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24029.5, 300 sec: 24187.2). Total num frames: 178143232. Throughput: 0: 6034.6. Samples: 44531540. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 16:43:23,126][15372] Avg episode reward: [(0, '37.986')] [2024-08-05 16:43:24,475][15444] Updated weights for policy 0, policy_version 21751 (0.0017) [2024-08-05 16:43:27,971][15444] Updated weights for policy 0, policy_version 21761 (0.0010) [2024-08-05 16:43:28,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 178266112. Throughput: 0: 6022.7. Samples: 44567440. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 16:43:28,119][15372] Avg episode reward: [(0, '38.335')] [2024-08-05 16:43:31,163][15444] Updated weights for policy 0, policy_version 21771 (0.0024) [2024-08-05 16:43:33,118][15372] Fps is (10 sec: 24578.1, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 178388992. Throughput: 0: 6034.2. Samples: 44603410. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:43:33,126][15372] Avg episode reward: [(0, '38.568')] [2024-08-05 16:43:34,567][15444] Updated weights for policy 0, policy_version 21781 (0.0023) [2024-08-05 16:43:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 178503680. Throughput: 0: 6042.0. Samples: 44622080. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:43:38,126][15372] Avg episode reward: [(0, '38.256')] [2024-08-05 16:43:38,204][15444] Updated weights for policy 0, policy_version 21791 (0.0022) [2024-08-05 16:43:41,199][15444] Updated weights for policy 0, policy_version 21801 (0.0018) [2024-08-05 16:43:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.2, 300 sec: 24187.3). 
Total num frames: 178634752. Throughput: 0: 6052.0. Samples: 44658310. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:43:43,127][15372] Avg episode reward: [(0, '37.694')] [2024-08-05 16:43:44,923][15444] Updated weights for policy 0, policy_version 21811 (0.0012) [2024-08-05 16:43:48,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24030.6, 300 sec: 24159.5). Total num frames: 178749440. Throughput: 0: 6050.7. Samples: 44694030. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 16:43:48,126][15372] Avg episode reward: [(0, '37.467')] [2024-08-05 16:43:48,394][15444] Updated weights for policy 0, policy_version 21821 (0.0025) [2024-08-05 16:43:51,543][15444] Updated weights for policy 0, policy_version 21831 (0.0013) [2024-08-05 16:43:51,688][15417] Signal inference workers to stop experience collection... (8000 times) [2024-08-05 16:43:51,690][15417] Signal inference workers to resume experience collection... (8000 times) [2024-08-05 16:43:51,765][15444] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-08-05 16:43:51,771][15444] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-08-05 16:43:53,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24171.1, 300 sec: 24159.5). Total num frames: 178872320. Throughput: 0: 6068.8. Samples: 44712610. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 16:43:53,119][15372] Avg episode reward: [(0, '37.591')] [2024-08-05 16:43:54,981][15444] Updated weights for policy 0, policy_version 21841 (0.0020) [2024-08-05 16:43:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 178995200. Throughput: 0: 6079.6. Samples: 44749070. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 16:43:58,126][15372] Avg episode reward: [(0, '37.816')] [2024-08-05 16:43:58,250][15444] Updated weights for policy 0, policy_version 21851 (0.0035) [2024-08-05 16:44:01,951][15444] Updated weights for policy 0, policy_version 21861 (0.0021) [2024-08-05 16:44:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 179109888. Throughput: 0: 6028.4. Samples: 44784390. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:44:03,119][15372] Avg episode reward: [(0, '38.026')] [2024-08-05 16:44:05,302][15444] Updated weights for policy 0, policy_version 21871 (0.0014) [2024-08-05 16:44:08,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.8, 300 sec: 24159.4). Total num frames: 179232768. Throughput: 0: 6037.9. Samples: 44803240. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:44:08,119][15372] Avg episode reward: [(0, '38.867')] [2024-08-05 16:44:08,535][15444] Updated weights for policy 0, policy_version 21881 (0.0037) [2024-08-05 16:44:12,312][15444] Updated weights for policy 0, policy_version 21891 (0.0011) [2024-08-05 16:44:13,123][15372] Fps is (10 sec: 24565.6, 60 sec: 24301.4, 300 sec: 24159.1). Total num frames: 179355648. Throughput: 0: 6029.4. Samples: 44838790. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:44:13,123][15372] Avg episode reward: [(0, '38.909')] [2024-08-05 16:44:15,232][15444] Updated weights for policy 0, policy_version 21901 (0.0031) [2024-08-05 16:44:18,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 179470336. Throughput: 0: 6022.4. Samples: 44874420. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:44:18,126][15372] Avg episode reward: [(0, '38.124')] [2024-08-05 16:44:18,131][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000021908_179470336.pth... 
[2024-08-05 16:44:18,261][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000021201_173678592.pth [2024-08-05 16:44:19,093][15444] Updated weights for policy 0, policy_version 21911 (0.0029) [2024-08-05 16:44:22,521][15444] Updated weights for policy 0, policy_version 21921 (0.0025) [2024-08-05 16:44:23,118][15372] Fps is (10 sec: 22947.5, 60 sec: 24030.2, 300 sec: 24131.7). Total num frames: 179585024. Throughput: 0: 6009.1. Samples: 44892490. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:44:23,119][15372] Avg episode reward: [(0, '37.720')] [2024-08-05 16:44:23,396][15417] Signal inference workers to stop experience collection... (8050 times) [2024-08-05 16:44:23,396][15417] Signal inference workers to resume experience collection... (8050 times) [2024-08-05 16:44:23,455][15444] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-08-05 16:44:23,464][15444] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-08-05 16:44:25,698][15444] Updated weights for policy 0, policy_version 21931 (0.0013) [2024-08-05 16:44:28,119][15372] Fps is (10 sec: 24574.5, 60 sec: 24166.1, 300 sec: 24160.2). Total num frames: 179716096. Throughput: 0: 6016.3. Samples: 44929050. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 16:44:28,121][15372] Avg episode reward: [(0, '37.649')] [2024-08-05 16:44:29,194][15444] Updated weights for policy 0, policy_version 21941 (0.0012) [2024-08-05 16:44:32,589][15444] Updated weights for policy 0, policy_version 21951 (0.0026) [2024-08-05 16:44:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 179830784. Throughput: 0: 6004.2. Samples: 44964220. 
Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:44:33,119][15372] Avg episode reward: [(0, '38.612')] [2024-08-05 16:44:35,998][15444] Updated weights for policy 0, policy_version 21961 (0.0023) [2024-08-05 16:44:38,119][15372] Fps is (10 sec: 23758.2, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 179953664. Throughput: 0: 6005.1. Samples: 44982840. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:44:38,119][15372] Avg episode reward: [(0, '39.151')] [2024-08-05 16:44:39,732][15444] Updated weights for policy 0, policy_version 21971 (0.0013) [2024-08-05 16:44:42,669][15444] Updated weights for policy 0, policy_version 21981 (0.0025) [2024-08-05 16:44:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 180076544. Throughput: 0: 5998.2. Samples: 45018990. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:44:43,119][15372] Avg episode reward: [(0, '38.183')] [2024-08-05 16:44:46,334][15444] Updated weights for policy 0, policy_version 21991 (0.0027) [2024-08-05 16:44:48,119][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 180191232. Throughput: 0: 6010.7. Samples: 45054870. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 16:44:48,119][15372] Avg episode reward: [(0, '37.501')] [2024-08-05 16:44:49,348][15444] Updated weights for policy 0, policy_version 22001 (0.0012) [2024-08-05 16:44:52,932][15444] Updated weights for policy 0, policy_version 22011 (0.0019) [2024-08-05 16:44:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 180314112. Throughput: 0: 6010.2. Samples: 45073700. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 16:44:53,119][15372] Avg episode reward: [(0, '38.591')] [2024-08-05 16:44:56,322][15444] Updated weights for policy 0, policy_version 22021 (0.0013) [2024-08-05 16:44:58,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24029.9, 300 sec: 24159.5). 
Total num frames: 180436992. Throughput: 0: 6018.1. Samples: 45109580. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 16:44:58,126][15372] Avg episode reward: [(0, '38.826')] [2024-08-05 16:44:59,506][15444] Updated weights for policy 0, policy_version 22031 (0.0011) [2024-08-05 16:45:03,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 180551680. Throughput: 0: 6039.8. Samples: 45146210. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:45:03,126][15372] Avg episode reward: [(0, '39.014')] [2024-08-05 16:45:03,175][15444] Updated weights for policy 0, policy_version 22041 (0.0011) [2024-08-05 16:45:03,305][15417] Signal inference workers to stop experience collection... (8100 times) [2024-08-05 16:45:03,306][15417] Signal inference workers to resume experience collection... (8100 times) [2024-08-05 16:45:03,342][15444] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-08-05 16:45:03,342][15444] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-08-05 16:45:06,139][15444] Updated weights for policy 0, policy_version 22051 (0.0025) [2024-08-05 16:45:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 180682752. Throughput: 0: 6053.8. Samples: 45164910. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:45:08,126][15372] Avg episode reward: [(0, '38.265')] [2024-08-05 16:45:09,764][15444] Updated weights for policy 0, policy_version 22061 (0.0033) [2024-08-05 16:45:13,087][15444] Updated weights for policy 0, policy_version 22071 (0.0026) [2024-08-05 16:45:13,118][15372] Fps is (10 sec: 25395.8, 60 sec: 24168.1, 300 sec: 24159.5). Total num frames: 180805632. Throughput: 0: 6050.1. Samples: 45201300. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 16:45:13,119][15372] Avg episode reward: [(0, '38.680')] [2024-08-05 16:45:16,374][15444] Updated weights for policy 0, policy_version 22081 (0.0030) [2024-08-05 16:45:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 180920320. Throughput: 0: 6060.4. Samples: 45236940. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:45:18,126][15372] Avg episode reward: [(0, '39.251')] [2024-08-05 16:45:20,103][15444] Updated weights for policy 0, policy_version 22091 (0.0025) [2024-08-05 16:45:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 181043200. Throughput: 0: 6058.5. Samples: 45255470. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:45:23,126][15372] Avg episode reward: [(0, '39.134')] [2024-08-05 16:45:23,417][15444] Updated weights for policy 0, policy_version 22101 (0.0011) [2024-08-05 16:45:26,609][15444] Updated weights for policy 0, policy_version 22111 (0.0024) [2024-08-05 16:45:28,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.7, 300 sec: 24187.2). Total num frames: 181166080. Throughput: 0: 6055.1. Samples: 45291470. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:45:28,126][15372] Avg episode reward: [(0, '38.054')] [2024-08-05 16:45:29,994][15444] Updated weights for policy 0, policy_version 22121 (0.0021) [2024-08-05 16:45:33,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 181288960. Throughput: 0: 6082.9. Samples: 45328600. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:45:33,126][15372] Avg episode reward: [(0, '37.285')] [2024-08-05 16:45:33,382][15444] Updated weights for policy 0, policy_version 22131 (0.0010) [2024-08-05 16:45:36,809][15444] Updated weights for policy 0, policy_version 22141 (0.0012) [2024-08-05 16:45:38,120][15372] Fps is (10 sec: 24571.6, 60 sec: 24302.3, 300 sec: 24187.1). 
Total num frames: 181411840. Throughput: 0: 6072.6. Samples: 45346980. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:45:38,120][15372] Avg episode reward: [(0, '37.392')] [2024-08-05 16:45:39,857][15444] Updated weights for policy 0, policy_version 22151 (0.0020) [2024-08-05 16:45:43,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 181534720. Throughput: 0: 6093.1. Samples: 45383770. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:45:43,126][15372] Avg episode reward: [(0, '38.280')] [2024-08-05 16:45:43,510][15444] Updated weights for policy 0, policy_version 22161 (0.0024) [2024-08-05 16:45:46,770][15444] Updated weights for policy 0, policy_version 22171 (0.0011) [2024-08-05 16:45:48,119][15372] Fps is (10 sec: 23760.6, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 181649408. Throughput: 0: 6073.6. Samples: 45419520. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:45:48,119][15372] Avg episode reward: [(0, '38.165')] [2024-08-05 16:45:48,978][15417] Signal inference workers to stop experience collection... (8150 times) [2024-08-05 16:45:48,981][15417] Signal inference workers to resume experience collection... (8150 times) [2024-08-05 16:45:49,044][15444] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-08-05 16:45:49,044][15444] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-08-05 16:45:50,125][15444] Updated weights for policy 0, policy_version 22181 (0.0020) [2024-08-05 16:45:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 181772288. Throughput: 0: 6068.7. Samples: 45438000. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:45:53,119][15372] Avg episode reward: [(0, '38.694')] [2024-08-05 16:45:53,725][15444] Updated weights for policy 0, policy_version 22191 (0.0025) [2024-08-05 16:45:56,860][15444] Updated weights for policy 0, policy_version 22201 (0.0020) [2024-08-05 16:45:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 181895168. Throughput: 0: 6055.5. Samples: 45473800. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 16:45:58,119][15372] Avg episode reward: [(0, '38.110')] [2024-08-05 16:46:00,411][15444] Updated weights for policy 0, policy_version 22211 (0.0037) [2024-08-05 16:46:03,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 182009856. Throughput: 0: 6072.4. Samples: 45510200. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:46:03,126][15372] Avg episode reward: [(0, '38.080')] [2024-08-05 16:46:03,741][15444] Updated weights for policy 0, policy_version 22221 (0.0022) [2024-08-05 16:46:07,191][15444] Updated weights for policy 0, policy_version 22231 (0.0017) [2024-08-05 16:46:08,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 182132736. Throughput: 0: 6068.2. Samples: 45528540. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:46:08,119][15372] Avg episode reward: [(0, '38.262')] [2024-08-05 16:46:10,524][15444] Updated weights for policy 0, policy_version 22241 (0.0019) [2024-08-05 16:46:13,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 182255616. Throughput: 0: 6076.7. Samples: 45564920. 
Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:46:13,119][15372] Avg episode reward: [(0, '38.018')] [2024-08-05 16:46:13,861][15444] Updated weights for policy 0, policy_version 22251 (0.0025) [2024-08-05 16:46:17,343][15444] Updated weights for policy 0, policy_version 22261 (0.0029) [2024-08-05 16:46:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 182378496. Throughput: 0: 6055.1. Samples: 45601080. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 16:46:18,119][15372] Avg episode reward: [(0, '38.803')] [2024-08-05 16:46:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000022263_182378496.pth... [2024-08-05 16:46:18,272][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000021555_176578560.pth [2024-08-05 16:46:20,515][15444] Updated weights for policy 0, policy_version 22271 (0.0011) [2024-08-05 16:46:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 182501376. Throughput: 0: 6050.0. Samples: 45619220. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 16:46:23,119][15372] Avg episode reward: [(0, '38.237')] [2024-08-05 16:46:24,109][15444] Updated weights for policy 0, policy_version 22281 (0.0018) [2024-08-05 16:46:27,581][15444] Updated weights for policy 0, policy_version 22291 (0.0012) [2024-08-05 16:46:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 182616064. Throughput: 0: 6040.0. Samples: 45655570. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 16:46:28,119][15372] Avg episode reward: [(0, '37.946')] [2024-08-05 16:46:30,689][15444] Updated weights for policy 0, policy_version 22301 (0.0014) [2024-08-05 16:46:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24160.2). Total num frames: 182738944. 
Throughput: 0: 6053.8. Samples: 45691940. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 16:46:33,126][15372] Avg episode reward: [(0, '37.855')] [2024-08-05 16:46:34,225][15444] Updated weights for policy 0, policy_version 22311 (0.0012) [2024-08-05 16:46:37,736][15444] Updated weights for policy 0, policy_version 22321 (0.0010) [2024-08-05 16:46:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24167.1, 300 sec: 24187.2). Total num frames: 182861824. Throughput: 0: 6034.0. Samples: 45709530. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 16:46:38,119][15372] Avg episode reward: [(0, '37.230')] [2024-08-05 16:46:39,533][15417] Signal inference workers to stop experience collection... (8200 times) [2024-08-05 16:46:39,533][15417] Signal inference workers to resume experience collection... (8200 times) [2024-08-05 16:46:39,601][15444] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-08-05 16:46:39,601][15444] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-08-05 16:46:41,110][15444] Updated weights for policy 0, policy_version 22331 (0.0028) [2024-08-05 16:46:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 182984704. Throughput: 0: 6052.7. Samples: 45746170. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 16:46:43,119][15372] Avg episode reward: [(0, '37.094')] [2024-08-05 16:46:44,304][15444] Updated weights for policy 0, policy_version 22341 (0.0020) [2024-08-05 16:46:47,753][15444] Updated weights for policy 0, policy_version 22351 (0.0011) [2024-08-05 16:46:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 183099392. Throughput: 0: 6070.2. Samples: 45783360. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:46:48,119][15372] Avg episode reward: [(0, '38.004')] [2024-08-05 16:46:50,968][15444] Updated weights for policy 0, policy_version 22361 (0.0025) [2024-08-05 16:46:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 183230464. Throughput: 0: 6064.4. Samples: 45801440. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:46:53,126][15372] Avg episode reward: [(0, '38.987')] [2024-08-05 16:46:54,564][15444] Updated weights for policy 0, policy_version 22371 (0.0020) [2024-08-05 16:46:57,942][15444] Updated weights for policy 0, policy_version 22381 (0.0012) [2024-08-05 16:46:58,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 183345152. Throughput: 0: 6050.9. Samples: 45837210. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:46:58,119][15372] Avg episode reward: [(0, '38.785')] [2024-08-05 16:47:01,507][15444] Updated weights for policy 0, policy_version 22391 (0.0018) [2024-08-05 16:47:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 183468032. Throughput: 0: 6041.8. Samples: 45872960. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 16:47:03,119][15372] Avg episode reward: [(0, '37.257')] [2024-08-05 16:47:04,760][15444] Updated weights for policy 0, policy_version 22401 (0.0042) [2024-08-05 16:47:08,120][15372] Fps is (10 sec: 23754.0, 60 sec: 24165.9, 300 sec: 24159.4). Total num frames: 183582720. Throughput: 0: 6049.6. Samples: 45891460. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 16:47:08,127][15372] Avg episode reward: [(0, '37.117')] [2024-08-05 16:47:08,263][15444] Updated weights for policy 0, policy_version 22411 (0.0010) [2024-08-05 16:47:11,496][15444] Updated weights for policy 0, policy_version 22421 (0.0037) [2024-08-05 16:47:13,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24159.5). 
Total num frames: 183705600. Throughput: 0: 6032.9. Samples: 45927050. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 16:47:13,126][15372] Avg episode reward: [(0, '38.088')] [2024-08-05 16:47:14,804][15444] Updated weights for policy 0, policy_version 22431 (0.0022) [2024-08-05 16:47:18,118][15372] Fps is (10 sec: 24578.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 183828480. Throughput: 0: 6031.3. Samples: 45963350. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:47:18,126][15372] Avg episode reward: [(0, '37.891')] [2024-08-05 16:47:18,454][15444] Updated weights for policy 0, policy_version 22441 (0.0026) [2024-08-05 16:47:19,380][15417] Signal inference workers to stop experience collection... (8250 times) [2024-08-05 16:47:19,381][15417] Signal inference workers to resume experience collection... (8250 times) [2024-08-05 16:47:19,428][15444] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-08-05 16:47:19,428][15444] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-08-05 16:47:21,513][15444] Updated weights for policy 0, policy_version 22451 (0.0021) [2024-08-05 16:47:23,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 183951360. Throughput: 0: 6063.8. Samples: 45982400. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:47:23,119][15372] Avg episode reward: [(0, '39.059')] [2024-08-05 16:47:25,087][15444] Updated weights for policy 0, policy_version 22461 (0.0022) [2024-08-05 16:47:28,090][15444] Updated weights for policy 0, policy_version 22471 (0.0024) [2024-08-05 16:47:28,119][15372] Fps is (10 sec: 25394.1, 60 sec: 24439.3, 300 sec: 24187.2). Total num frames: 184082432. Throughput: 0: 6069.9. Samples: 46019320. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:47:28,119][15372] Avg episode reward: [(0, '39.704')] [2024-08-05 16:47:31,636][15444] Updated weights for policy 0, policy_version 22481 (0.0018) [2024-08-05 16:47:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 184197120. Throughput: 0: 6055.1. Samples: 46055840. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:47:33,126][15372] Avg episode reward: [(0, '38.897')] [2024-08-05 16:47:34,851][15444] Updated weights for policy 0, policy_version 22491 (0.0021) [2024-08-05 16:47:38,121][15372] Fps is (10 sec: 23752.8, 60 sec: 24302.1, 300 sec: 24214.9). Total num frames: 184320000. Throughput: 0: 6074.8. Samples: 46074820. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:47:38,128][15372] Avg episode reward: [(0, '38.043')] [2024-08-05 16:47:38,187][15444] Updated weights for policy 0, policy_version 22501 (0.0011) [2024-08-05 16:47:41,624][15444] Updated weights for policy 0, policy_version 22511 (0.0011) [2024-08-05 16:47:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24187.4). Total num frames: 184442880. Throughput: 0: 6086.7. Samples: 46111110. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:47:43,119][15372] Avg episode reward: [(0, '38.299')] [2024-08-05 16:47:44,881][15444] Updated weights for policy 0, policy_version 22521 (0.0011) [2024-08-05 16:47:48,118][15372] Fps is (10 sec: 23761.7, 60 sec: 24302.9, 300 sec: 24188.2). Total num frames: 184557568. Throughput: 0: 6091.3. Samples: 46147070. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:47:48,126][15372] Avg episode reward: [(0, '39.165')] [2024-08-05 16:47:48,504][15444] Updated weights for policy 0, policy_version 22531 (0.0013) [2024-08-05 16:47:52,043][15444] Updated weights for policy 0, policy_version 22541 (0.0027) [2024-08-05 16:47:53,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24302.8, 300 sec: 24242.7). 
Total num frames: 184688640. Throughput: 0: 6087.2. Samples: 46165380. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:47:53,119][15372] Avg episode reward: [(0, '38.261')] [2024-08-05 16:47:55,019][15444] Updated weights for policy 0, policy_version 22551 (0.0027) [2024-08-05 16:47:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 184803328. Throughput: 0: 6111.1. Samples: 46202050. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:47:58,126][15372] Avg episode reward: [(0, '38.503')] [2024-08-05 16:47:58,619][15444] Updated weights for policy 0, policy_version 22561 (0.0021) [2024-08-05 16:48:01,744][15444] Updated weights for policy 0, policy_version 22571 (0.0013) [2024-08-05 16:48:03,120][15372] Fps is (10 sec: 23754.0, 60 sec: 24302.3, 300 sec: 24215.0). Total num frames: 184926208. Throughput: 0: 6106.5. Samples: 46238150. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:48:03,120][15372] Avg episode reward: [(0, '38.101')] [2024-08-05 16:48:03,431][15417] Signal inference workers to stop experience collection... (8300 times) [2024-08-05 16:48:03,431][15417] Signal inference workers to resume experience collection... (8300 times) [2024-08-05 16:48:03,467][15444] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-08-05 16:48:03,510][15444] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-08-05 16:48:05,243][15444] Updated weights for policy 0, policy_version 22581 (0.0023) [2024-08-05 16:48:08,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24439.8, 300 sec: 24242.8). Total num frames: 185049088. Throughput: 0: 6088.9. Samples: 46256400. 
Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 16:48:08,119][15372] Avg episode reward: [(0, '38.545')] [2024-08-05 16:48:09,004][15444] Updated weights for policy 0, policy_version 22591 (0.0019) [2024-08-05 16:48:11,823][15444] Updated weights for policy 0, policy_version 22601 (0.0011) [2024-08-05 16:48:13,118][15372] Fps is (10 sec: 23760.4, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 185163776. Throughput: 0: 6072.3. Samples: 46292570. Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 16:48:13,126][15372] Avg episode reward: [(0, '39.062')] [2024-08-05 16:48:15,551][15444] Updated weights for policy 0, policy_version 22611 (0.0028) [2024-08-05 16:48:18,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 185294848. Throughput: 0: 6062.4. Samples: 46328650. Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 16:48:18,126][15372] Avg episode reward: [(0, '39.634')] [2024-08-05 16:48:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000022619_185294848.pth... [2024-08-05 16:48:18,264][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000021908_179470336.pth [2024-08-05 16:48:19,192][15444] Updated weights for policy 0, policy_version 22621 (0.0018) [2024-08-05 16:48:22,194][15444] Updated weights for policy 0, policy_version 22631 (0.0023) [2024-08-05 16:48:23,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 185409536. Throughput: 0: 6038.9. Samples: 46346560. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:48:23,126][15372] Avg episode reward: [(0, '39.169')] [2024-08-05 16:48:25,956][15444] Updated weights for policy 0, policy_version 22641 (0.0024) [2024-08-05 16:48:28,119][15372] Fps is (10 sec: 22937.3, 60 sec: 24030.0, 300 sec: 24187.2). Total num frames: 185524224. 
Throughput: 0: 6024.9. Samples: 46382230. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:48:28,119][15372] Avg episode reward: [(0, '38.610')] [2024-08-05 16:48:29,130][15444] Updated weights for policy 0, policy_version 22651 (0.0020) [2024-08-05 16:48:32,568][15444] Updated weights for policy 0, policy_version 22661 (0.0012) [2024-08-05 16:48:33,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 185655296. Throughput: 0: 6018.5. Samples: 46417900. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:48:33,119][15372] Avg episode reward: [(0, '38.690')] [2024-08-05 16:48:36,255][15444] Updated weights for policy 0, policy_version 22671 (0.0043) [2024-08-05 16:48:38,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24030.7, 300 sec: 24159.5). Total num frames: 185761792. Throughput: 0: 6023.2. Samples: 46436420. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 16:48:38,119][15372] Avg episode reward: [(0, '39.294')] [2024-08-05 16:48:38,880][15417] Signal inference workers to stop experience collection... (8350 times) [2024-08-05 16:48:38,885][15417] Signal inference workers to resume experience collection... (8350 times) [2024-08-05 16:48:38,961][15444] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-08-05 16:48:38,962][15444] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-08-05 16:48:39,231][15444] Updated weights for policy 0, policy_version 22681 (0.0021) [2024-08-05 16:48:42,843][15444] Updated weights for policy 0, policy_version 22691 (0.0014) [2024-08-05 16:48:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 185892864. Throughput: 0: 6017.1. Samples: 46472820. 
Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 16:48:43,119][15372] Avg episode reward: [(0, '38.459')] [2024-08-05 16:48:45,990][15444] Updated weights for policy 0, policy_version 22701 (0.0022) [2024-08-05 16:48:48,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 186015744. Throughput: 0: 6016.9. Samples: 46508900. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 16:48:48,119][15372] Avg episode reward: [(0, '38.355')] [2024-08-05 16:48:49,686][15444] Updated weights for policy 0, policy_version 22711 (0.0016) [2024-08-05 16:48:53,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23893.5, 300 sec: 24159.5). Total num frames: 186122240. Throughput: 0: 5994.3. Samples: 46526140. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:48:53,119][15372] Avg episode reward: [(0, '38.834')] [2024-08-05 16:48:53,282][15444] Updated weights for policy 0, policy_version 22721 (0.0038) [2024-08-05 16:48:56,329][15444] Updated weights for policy 0, policy_version 22731 (0.0013) [2024-08-05 16:48:58,118][15372] Fps is (10 sec: 22937.7, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 186245120. Throughput: 0: 5985.3. Samples: 46561910. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:48:58,126][15372] Avg episode reward: [(0, '38.668')] [2024-08-05 16:49:00,090][15444] Updated weights for policy 0, policy_version 22741 (0.0012) [2024-08-05 16:49:03,108][15444] Updated weights for policy 0, policy_version 22751 (0.0024) [2024-08-05 16:49:03,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24167.0, 300 sec: 24215.0). Total num frames: 186376192. Throughput: 0: 5996.4. Samples: 46598490. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:49:03,122][15372] Avg episode reward: [(0, '37.506')] [2024-08-05 16:49:06,738][15444] Updated weights for policy 0, policy_version 22761 (0.0032) [2024-08-05 16:49:08,119][15372] Fps is (10 sec: 23756.4, 60 sec: 23893.4, 300 sec: 24159.8). 
Total num frames: 186482688. Throughput: 0: 6004.2. Samples: 46616750. Policy #0 lag: (min: 1.0, avg: 4.7, max: 10.0) [2024-08-05 16:49:08,119][15372] Avg episode reward: [(0, '37.891')] [2024-08-05 16:49:10,196][15444] Updated weights for policy 0, policy_version 22771 (0.0033) [2024-08-05 16:49:13,124][15372] Fps is (10 sec: 22925.3, 60 sec: 24027.7, 300 sec: 24186.8). Total num frames: 186605568. Throughput: 0: 6018.9. Samples: 46653110. Policy #0 lag: (min: 1.0, avg: 4.7, max: 10.0) [2024-08-05 16:49:13,133][15372] Avg episode reward: [(0, '38.085')] [2024-08-05 16:49:13,321][15417] Signal inference workers to stop experience collection... (8400 times) [2024-08-05 16:49:13,322][15417] Signal inference workers to resume experience collection... (8400 times) [2024-08-05 16:49:13,367][15444] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-08-05 16:49:13,367][15444] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-08-05 16:49:13,420][15444] Updated weights for policy 0, policy_version 22781 (0.0020) [2024-08-05 16:49:17,208][15444] Updated weights for policy 0, policy_version 22791 (0.0012) [2024-08-05 16:49:18,118][15372] Fps is (10 sec: 25395.6, 60 sec: 24029.9, 300 sec: 24242.8). Total num frames: 186736640. Throughput: 0: 6019.3. Samples: 46688770. Policy #0 lag: (min: 1.0, avg: 4.7, max: 10.0) [2024-08-05 16:49:18,119][15372] Avg episode reward: [(0, '38.585')] [2024-08-05 16:49:20,060][15444] Updated weights for policy 0, policy_version 22801 (0.0011) [2024-08-05 16:49:23,119][15372] Fps is (10 sec: 24589.0, 60 sec: 24029.9, 300 sec: 24187.3). Total num frames: 186851328. Throughput: 0: 6019.1. Samples: 46707280. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 16:49:23,126][15372] Avg episode reward: [(0, '38.728')] [2024-08-05 16:49:23,835][15444] Updated weights for policy 0, policy_version 22811 (0.0018) [2024-08-05 16:49:27,259][15444] Updated weights for policy 0, policy_version 22821 (0.0016) [2024-08-05 16:49:28,119][15372] Fps is (10 sec: 22937.4, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 186966016. Throughput: 0: 6009.5. Samples: 46743250. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 16:49:28,119][15372] Avg episode reward: [(0, '38.638')] [2024-08-05 16:49:30,440][15444] Updated weights for policy 0, policy_version 22831 (0.0018) [2024-08-05 16:49:33,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 187097088. Throughput: 0: 6021.1. Samples: 46779850. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 16:49:33,119][15372] Avg episode reward: [(0, '37.681')] [2024-08-05 16:49:33,998][15444] Updated weights for policy 0, policy_version 22841 (0.0015) [2024-08-05 16:49:36,949][15444] Updated weights for policy 0, policy_version 22851 (0.0025) [2024-08-05 16:49:38,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 187211776. Throughput: 0: 6045.1. Samples: 46798170. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:49:38,126][15372] Avg episode reward: [(0, '38.249')] [2024-08-05 16:49:40,723][15444] Updated weights for policy 0, policy_version 22861 (0.0030) [2024-08-05 16:49:43,045][15417] Signal inference workers to stop experience collection... (8450 times) [2024-08-05 16:49:43,046][15417] Signal inference workers to resume experience collection... (8450 times) [2024-08-05 16:49:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 187334656. Throughput: 0: 6051.8. Samples: 46834240. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:49:43,119][15372] Avg episode reward: [(0, '38.441')] [2024-08-05 16:49:43,127][15444] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-08-05 16:49:43,134][15444] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-08-05 16:49:44,303][15444] Updated weights for policy 0, policy_version 22871 (0.0013) [2024-08-05 16:49:47,402][15444] Updated weights for policy 0, policy_version 22881 (0.0019) [2024-08-05 16:49:48,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 187457536. Throughput: 0: 6022.2. Samples: 46869490. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:49:48,119][15372] Avg episode reward: [(0, '38.375')] [2024-08-05 16:49:50,958][15444] Updated weights for policy 0, policy_version 22891 (0.0011) [2024-08-05 16:49:53,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 187572224. Throughput: 0: 6031.5. Samples: 46888170. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 16:49:53,119][15372] Avg episode reward: [(0, '38.895')] [2024-08-05 16:49:54,186][15444] Updated weights for policy 0, policy_version 22901 (0.0025) [2024-08-05 16:49:57,662][15444] Updated weights for policy 0, policy_version 22911 (0.0010) [2024-08-05 16:49:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 187695104. Throughput: 0: 6020.7. Samples: 46924010. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 16:49:58,119][15372] Avg episode reward: [(0, '39.212')] [2024-08-05 16:50:00,883][15444] Updated weights for policy 0, policy_version 22921 (0.0029) [2024-08-05 16:50:03,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 187817984. Throughput: 0: 6025.3. Samples: 46959910. 
Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 16:50:03,126][15372] Avg episode reward: [(0, '39.016')] [2024-08-05 16:50:04,425][15444] Updated weights for policy 0, policy_version 22931 (0.0028) [2024-08-05 16:50:07,975][15444] Updated weights for policy 0, policy_version 22941 (0.0012) [2024-08-05 16:50:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 187932672. Throughput: 0: 6020.9. Samples: 46978220. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:50:08,119][15372] Avg episode reward: [(0, '39.014')] [2024-08-05 16:50:11,030][15444] Updated weights for policy 0, policy_version 22951 (0.0020) [2024-08-05 16:50:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24168.6, 300 sec: 24187.2). Total num frames: 188055552. Throughput: 0: 6020.5. Samples: 47014170. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:50:13,119][15372] Avg episode reward: [(0, '38.872')] [2024-08-05 16:50:14,689][15444] Updated weights for policy 0, policy_version 22961 (0.0021) [2024-08-05 16:50:18,105][15444] Updated weights for policy 0, policy_version 22971 (0.0014) [2024-08-05 16:50:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 188178432. Throughput: 0: 5995.1. Samples: 47049630. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:50:18,119][15372] Avg episode reward: [(0, '39.353')] [2024-08-05 16:50:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000022971_188178432.pth... [2024-08-05 16:50:18,229][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000022263_182378496.pth [2024-08-05 16:50:21,404][15444] Updated weights for policy 0, policy_version 22981 (0.0016) [2024-08-05 16:50:23,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 188293120. 
Throughput: 0: 5999.5. Samples: 47068150. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 16:50:23,127][15372] Avg episode reward: [(0, '38.890')] [2024-08-05 16:50:24,864][15444] Updated weights for policy 0, policy_version 22991 (0.0010) [2024-08-05 16:50:28,070][15444] Updated weights for policy 0, policy_version 23001 (0.0022) [2024-08-05 16:50:28,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 188424192. Throughput: 0: 6015.9. Samples: 47104960. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:50:28,119][15372] Avg episode reward: [(0, '38.311')] [2024-08-05 16:50:30,048][15417] Signal inference workers to stop experience collection... (8500 times) [2024-08-05 16:50:30,048][15417] Signal inference workers to resume experience collection... (8500 times) [2024-08-05 16:50:30,079][15444] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-08-05 16:50:30,123][15444] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-08-05 16:50:31,636][15444] Updated weights for policy 0, policy_version 23011 (0.0012) [2024-08-05 16:50:33,119][15372] Fps is (10 sec: 24576.6, 60 sec: 24029.8, 300 sec: 24159.6). Total num frames: 188538880. Throughput: 0: 6042.7. Samples: 47141410. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:50:33,119][15372] Avg episode reward: [(0, '39.349')] [2024-08-05 16:50:34,822][15444] Updated weights for policy 0, policy_version 23021 (0.0019) [2024-08-05 16:50:38,118][15372] Fps is (10 sec: 23757.6, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 188661760. Throughput: 0: 6040.9. Samples: 47160010. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 16:50:38,119][15372] Avg episode reward: [(0, '39.281')] [2024-08-05 16:50:38,268][15444] Updated weights for policy 0, policy_version 23031 (0.0015) [2024-08-05 16:50:41,620][15444] Updated weights for policy 0, policy_version 23041 (0.0016) [2024-08-05 16:50:43,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 188784640. Throughput: 0: 6052.6. Samples: 47196380. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 16:50:43,127][15372] Avg episode reward: [(0, '38.513')] [2024-08-05 16:50:44,837][15444] Updated weights for policy 0, policy_version 23051 (0.0031) [2024-08-05 16:50:48,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 188907520. Throughput: 0: 6066.9. Samples: 47232920. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 16:50:48,127][15372] Avg episode reward: [(0, '38.466')] [2024-08-05 16:50:48,600][15444] Updated weights for policy 0, policy_version 23061 (0.0015) [2024-08-05 16:50:52,029][15444] Updated weights for policy 0, policy_version 23071 (0.0020) [2024-08-05 16:50:53,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24303.1, 300 sec: 24187.3). Total num frames: 189030400. Throughput: 0: 6069.3. Samples: 47251340. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 16:50:53,119][15372] Avg episode reward: [(0, '38.438')] [2024-08-05 16:50:55,008][15444] Updated weights for policy 0, policy_version 23081 (0.0015) [2024-08-05 16:50:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 189145088. Throughput: 0: 6078.4. Samples: 47287700. 
Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 16:50:58,126][15372] Avg episode reward: [(0, '38.602')] [2024-08-05 16:50:58,663][15444] Updated weights for policy 0, policy_version 23091 (0.0015) [2024-08-05 16:51:02,018][15444] Updated weights for policy 0, policy_version 23101 (0.0023) [2024-08-05 16:51:03,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 189267968. Throughput: 0: 6081.5. Samples: 47323300. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 16:51:03,119][15372] Avg episode reward: [(0, '39.656')] [2024-08-05 16:51:05,224][15444] Updated weights for policy 0, policy_version 23111 (0.0023) [2024-08-05 16:51:08,119][15372] Fps is (10 sec: 24574.3, 60 sec: 24302.6, 300 sec: 24187.2). Total num frames: 189390848. Throughput: 0: 6085.3. Samples: 47341990. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 16:51:08,127][15372] Avg episode reward: [(0, '38.553')] [2024-08-05 16:51:08,606][15444] Updated weights for policy 0, policy_version 23121 (0.0012) [2024-08-05 16:51:10,769][15417] Signal inference workers to stop experience collection... (8550 times) [2024-08-05 16:51:10,769][15417] Signal inference workers to resume experience collection... (8550 times) [2024-08-05 16:51:10,820][15444] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-08-05 16:51:10,820][15444] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-08-05 16:51:11,942][15444] Updated weights for policy 0, policy_version 23131 (0.0020) [2024-08-05 16:51:13,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24439.3, 300 sec: 24215.0). Total num frames: 189521920. Throughput: 0: 6086.7. Samples: 47378860. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:51:13,119][15372] Avg episode reward: [(0, '38.624')] [2024-08-05 16:51:15,127][15444] Updated weights for policy 0, policy_version 23141 (0.0013) [2024-08-05 16:51:18,118][15372] Fps is (10 sec: 24577.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 189636608. Throughput: 0: 6093.1. Samples: 47415600. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:51:18,126][15372] Avg episode reward: [(0, '38.703')] [2024-08-05 16:51:18,730][15444] Updated weights for policy 0, policy_version 23151 (0.0028) [2024-08-05 16:51:22,012][15444] Updated weights for policy 0, policy_version 23161 (0.0011) [2024-08-05 16:51:23,118][15372] Fps is (10 sec: 23757.5, 60 sec: 24439.6, 300 sec: 24215.0). Total num frames: 189759488. Throughput: 0: 6088.0. Samples: 47433970. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:51:23,119][15372] Avg episode reward: [(0, '38.443')] [2024-08-05 16:51:25,323][15444] Updated weights for policy 0, policy_version 23171 (0.0012) [2024-08-05 16:51:28,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 189882368. Throughput: 0: 6094.0. Samples: 47470610. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:51:28,119][15372] Avg episode reward: [(0, '37.756')] [2024-08-05 16:51:28,840][15444] Updated weights for policy 0, policy_version 23181 (0.0015) [2024-08-05 16:51:32,066][15444] Updated weights for policy 0, policy_version 23191 (0.0013) [2024-08-05 16:51:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 189997056. Throughput: 0: 6075.3. Samples: 47506310. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:51:33,119][15372] Avg episode reward: [(0, '39.230')] [2024-08-05 16:51:35,452][15444] Updated weights for policy 0, policy_version 23201 (0.0013) [2024-08-05 16:51:38,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.4, 300 sec: 24215.0). 
Total num frames: 190128128. Throughput: 0: 6082.2. Samples: 47525040. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 16:51:38,119][15372] Avg episode reward: [(0, '39.224')] [2024-08-05 16:51:39,051][15444] Updated weights for policy 0, policy_version 23211 (0.0022) [2024-08-05 16:51:42,205][15444] Updated weights for policy 0, policy_version 23221 (0.0018) [2024-08-05 16:51:43,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24302.8, 300 sec: 24215.0). Total num frames: 190242816. Throughput: 0: 6082.8. Samples: 47561430. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:51:43,127][15372] Avg episode reward: [(0, '39.842')] [2024-08-05 16:51:45,596][15444] Updated weights for policy 0, policy_version 23231 (0.0015) [2024-08-05 16:51:48,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 190365696. Throughput: 0: 6099.3. Samples: 47597770. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:51:48,119][15372] Avg episode reward: [(0, '39.948')] [2024-08-05 16:51:49,113][15444] Updated weights for policy 0, policy_version 23241 (0.0035) [2024-08-05 16:51:52,655][15444] Updated weights for policy 0, policy_version 23251 (0.0012) [2024-08-05 16:51:53,118][15372] Fps is (10 sec: 23757.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 190480384. Throughput: 0: 6087.9. Samples: 47615940. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 16:51:53,119][15372] Avg episode reward: [(0, '39.274')] [2024-08-05 16:51:55,616][15444] Updated weights for policy 0, policy_version 23261 (0.0019) [2024-08-05 16:51:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 190603264. Throughput: 0: 6055.1. Samples: 47651340. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:51:58,126][15372] Avg episode reward: [(0, '38.591')] [2024-08-05 16:51:59,547][15444] Updated weights for policy 0, policy_version 23271 (0.0013) [2024-08-05 16:52:01,496][15417] Signal inference workers to stop experience collection... (8600 times) [2024-08-05 16:52:01,497][15417] Signal inference workers to resume experience collection... (8600 times) [2024-08-05 16:52:01,569][15444] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-08-05 16:52:01,569][15444] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-08-05 16:52:02,929][15444] Updated weights for policy 0, policy_version 23281 (0.0014) [2024-08-05 16:52:03,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24187.3). Total num frames: 190717952. Throughput: 0: 6026.4. Samples: 47686790. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:52:03,119][15372] Avg episode reward: [(0, '38.123')] [2024-08-05 16:52:05,977][15444] Updated weights for policy 0, policy_version 23291 (0.0019) [2024-08-05 16:52:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.7, 300 sec: 24187.2). Total num frames: 190840832. Throughput: 0: 6028.7. Samples: 47705260. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:52:08,119][15372] Avg episode reward: [(0, '38.996')] [2024-08-05 16:52:09,753][15444] Updated weights for policy 0, policy_version 23301 (0.0017) [2024-08-05 16:52:13,119][15372] Fps is (10 sec: 23757.0, 60 sec: 23893.4, 300 sec: 24159.5). Total num frames: 190955520. Throughput: 0: 6007.8. Samples: 47740960. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 16:52:13,126][15372] Avg episode reward: [(0, '39.423')] [2024-08-05 16:52:13,222][15444] Updated weights for policy 0, policy_version 23311 (0.0037) [2024-08-05 16:52:16,224][15444] Updated weights for policy 0, policy_version 23321 (0.0011) [2024-08-05 16:52:18,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 191078400. Throughput: 0: 6020.0. Samples: 47777210. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 16:52:18,127][15372] Avg episode reward: [(0, '38.136')] [2024-08-05 16:52:18,131][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000023326_191086592.pth... [2024-08-05 16:52:18,289][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000022619_185294848.pth [2024-08-05 16:52:20,013][15444] Updated weights for policy 0, policy_version 23331 (0.0021) [2024-08-05 16:52:23,119][15372] Fps is (10 sec: 25395.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 191209472. Throughput: 0: 5999.8. Samples: 47795030. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 16:52:23,126][15372] Avg episode reward: [(0, '38.030')] [2024-08-05 16:52:23,134][15444] Updated weights for policy 0, policy_version 23341 (0.0019) [2024-08-05 16:52:26,538][15444] Updated weights for policy 0, policy_version 23351 (0.0020) [2024-08-05 16:52:28,118][15372] Fps is (10 sec: 23757.4, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 191315968. Throughput: 0: 5991.4. Samples: 47831040. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 16:52:28,126][15372] Avg episode reward: [(0, '38.712')] [2024-08-05 16:52:30,065][15444] Updated weights for policy 0, policy_version 23361 (0.0011) [2024-08-05 16:52:33,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24159.6). Total num frames: 191447040. 
Throughput: 0: 6005.3. Samples: 47868010. Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 16:52:33,126][15372] Avg episode reward: [(0, '38.505')] [2024-08-05 16:52:33,198][15444] Updated weights for policy 0, policy_version 23371 (0.0017) [2024-08-05 16:52:36,664][15444] Updated weights for policy 0, policy_version 23381 (0.0029) [2024-08-05 16:52:38,119][15372] Fps is (10 sec: 25395.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 191569920. Throughput: 0: 6008.9. Samples: 47886340. Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 16:52:38,126][15372] Avg episode reward: [(0, '38.540')] [2024-08-05 16:52:38,898][15417] Signal inference workers to stop experience collection... (8650 times) [2024-08-05 16:52:38,898][15417] Signal inference workers to resume experience collection... (8650 times) [2024-08-05 16:52:38,928][15444] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-08-05 16:52:38,929][15444] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-08-05 16:52:40,184][15444] Updated weights for policy 0, policy_version 23391 (0.0022) [2024-08-05 16:52:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 191692800. Throughput: 0: 6046.4. Samples: 47923430. Policy #0 lag: (min: 1.0, avg: 4.6, max: 9.0) [2024-08-05 16:52:43,119][15372] Avg episode reward: [(0, '38.733')] [2024-08-05 16:52:43,215][15444] Updated weights for policy 0, policy_version 23401 (0.0011) [2024-08-05 16:52:46,809][15444] Updated weights for policy 0, policy_version 23411 (0.0012) [2024-08-05 16:52:48,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 191807488. Throughput: 0: 6039.1. Samples: 47958550. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:52:48,119][15372] Avg episode reward: [(0, '38.966')] [2024-08-05 16:52:50,256][15444] Updated weights for policy 0, policy_version 23421 (0.0012) [2024-08-05 16:52:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 191930368. Throughput: 0: 6051.6. Samples: 47977580. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:52:53,119][15372] Avg episode reward: [(0, '39.324')] [2024-08-05 16:52:53,623][15444] Updated weights for policy 0, policy_version 23431 (0.0022) [2024-08-05 16:52:57,231][15444] Updated weights for policy 0, policy_version 23441 (0.0011) [2024-08-05 16:52:58,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.3, 300 sec: 24159.6). Total num frames: 192053248. Throughput: 0: 6060.9. Samples: 48013700. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:52:58,119][15372] Avg episode reward: [(0, '39.603')] [2024-08-05 16:53:00,231][15444] Updated weights for policy 0, policy_version 23451 (0.0019) [2024-08-05 16:53:03,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 192167936. Throughput: 0: 6068.0. Samples: 48050270. Policy #0 lag: (min: 1.0, avg: 4.8, max: 10.0) [2024-08-05 16:53:03,127][15372] Avg episode reward: [(0, '38.612')] [2024-08-05 16:53:03,879][15444] Updated weights for policy 0, policy_version 23461 (0.0031) [2024-08-05 16:53:07,311][15444] Updated weights for policy 0, policy_version 23471 (0.0019) [2024-08-05 16:53:08,118][15372] Fps is (10 sec: 23757.6, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 192290816. Throughput: 0: 6081.8. Samples: 48068710. Policy #0 lag: (min: 1.0, avg: 4.8, max: 10.0) [2024-08-05 16:53:08,119][15372] Avg episode reward: [(0, '38.090')] [2024-08-05 16:53:10,362][15444] Updated weights for policy 0, policy_version 23481 (0.0024) [2024-08-05 16:53:13,118][15372] Fps is (10 sec: 25395.5, 60 sec: 24439.5, 300 sec: 24159.5). 
Total num frames: 192421888. Throughput: 0: 6093.6. Samples: 48105250. Policy #0 lag: (min: 1.0, avg: 4.8, max: 10.0) [2024-08-05 16:53:13,119][15372] Avg episode reward: [(0, '39.591')] [2024-08-05 16:53:14,118][15444] Updated weights for policy 0, policy_version 23491 (0.0019) [2024-08-05 16:53:14,921][15417] Signal inference workers to stop experience collection... (8700 times) [2024-08-05 16:53:14,922][15417] Signal inference workers to resume experience collection... (8700 times) [2024-08-05 16:53:14,993][15444] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-08-05 16:53:14,993][15444] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-08-05 16:53:17,002][15444] Updated weights for policy 0, policy_version 23501 (0.0011) [2024-08-05 16:53:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.1, 300 sec: 24159.5). Total num frames: 192536576. Throughput: 0: 6075.8. Samples: 48141420. Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 16:53:18,126][15372] Avg episode reward: [(0, '40.161')] [2024-08-05 16:53:18,129][15417] Saving new best policy, reward=40.161! [2024-08-05 16:53:20,647][15444] Updated weights for policy 0, policy_version 23511 (0.0019) [2024-08-05 16:53:23,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24302.8, 300 sec: 24215.0). Total num frames: 192667648. Throughput: 0: 6074.2. Samples: 48159680. Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 16:53:23,119][15372] Avg episode reward: [(0, '39.038')] [2024-08-05 16:53:23,904][15444] Updated weights for policy 0, policy_version 23521 (0.0018) [2024-08-05 16:53:27,418][15444] Updated weights for policy 0, policy_version 23531 (0.0010) [2024-08-05 16:53:28,120][15372] Fps is (10 sec: 24571.5, 60 sec: 24438.7, 300 sec: 24159.3). Total num frames: 192782336. Throughput: 0: 6052.0. Samples: 48195780. 
Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 16:53:28,121][15372] Avg episode reward: [(0, '38.281')] [2024-08-05 16:53:31,079][15444] Updated weights for policy 0, policy_version 23541 (0.0012) [2024-08-05 16:53:33,118][15372] Fps is (10 sec: 22938.4, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 192897024. Throughput: 0: 6064.9. Samples: 48231470. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 16:53:33,119][15372] Avg episode reward: [(0, '38.566')] [2024-08-05 16:53:34,019][15444] Updated weights for policy 0, policy_version 23551 (0.0020) [2024-08-05 16:53:37,747][15444] Updated weights for policy 0, policy_version 23561 (0.0019) [2024-08-05 16:53:38,119][15372] Fps is (10 sec: 23760.4, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 193019904. Throughput: 0: 6043.5. Samples: 48249540. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 16:53:38,119][15372] Avg episode reward: [(0, '38.672')] [2024-08-05 16:53:41,094][15444] Updated weights for policy 0, policy_version 23571 (0.0010) [2024-08-05 16:53:43,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24166.2, 300 sec: 24159.4). Total num frames: 193142784. Throughput: 0: 6034.4. Samples: 48285250. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 16:53:43,127][15372] Avg episode reward: [(0, '38.749')] [2024-08-05 16:53:43,888][15417] Signal inference workers to stop experience collection... (8750 times) [2024-08-05 16:53:43,889][15417] Signal inference workers to resume experience collection... 
(8750 times) [2024-08-05 16:53:43,923][15444] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-08-05 16:53:43,934][15444] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-08-05 16:53:44,552][15444] Updated weights for policy 0, policy_version 23581 (0.0013) [2024-08-05 16:53:47,995][15444] Updated weights for policy 0, policy_version 23591 (0.0027) [2024-08-05 16:53:48,119][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 193257472. Throughput: 0: 6006.2. Samples: 48320550. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 16:53:48,119][15372] Avg episode reward: [(0, '39.011')] [2024-08-05 16:53:51,229][15444] Updated weights for policy 0, policy_version 23601 (0.0029) [2024-08-05 16:53:53,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 193380352. Throughput: 0: 6012.7. Samples: 48339280. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 16:53:53,126][15372] Avg episode reward: [(0, '38.176')] [2024-08-05 16:53:54,811][15444] Updated weights for policy 0, policy_version 23611 (0.0021) [2024-08-05 16:53:58,086][15444] Updated weights for policy 0, policy_version 23621 (0.0013) [2024-08-05 16:53:58,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 193503232. Throughput: 0: 6009.8. Samples: 48375690. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 16:53:58,119][15372] Avg episode reward: [(0, '38.855')] [2024-08-05 16:54:01,368][15444] Updated weights for policy 0, policy_version 23631 (0.0010) [2024-08-05 16:54:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 193617920. Throughput: 0: 5995.3. Samples: 48411210. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 16:54:03,126][15372] Avg episode reward: [(0, '38.612')] [2024-08-05 16:54:05,143][15444] Updated weights for policy 0, policy_version 23641 (0.0010) [2024-08-05 16:54:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24187.7). Total num frames: 193740800. Throughput: 0: 5996.9. Samples: 48429540. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:54:08,126][15372] Avg episode reward: [(0, '38.716')] [2024-08-05 16:54:08,428][15444] Updated weights for policy 0, policy_version 23651 (0.0020) [2024-08-05 16:54:11,742][15444] Updated weights for policy 0, policy_version 23661 (0.0011) [2024-08-05 16:54:13,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 193863680. Throughput: 0: 5993.1. Samples: 48465460. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:54:13,119][15372] Avg episode reward: [(0, '38.752')] [2024-08-05 16:54:15,130][15444] Updated weights for policy 0, policy_version 23671 (0.0016) [2024-08-05 16:54:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 193986560. Throughput: 0: 6023.5. Samples: 48502530. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 16:54:18,126][15372] Avg episode reward: [(0, '39.700')] [2024-08-05 16:54:18,168][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000023680_193986560.pth... [2024-08-05 16:54:18,284][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000022971_188178432.pth [2024-08-05 16:54:18,385][15444] Updated weights for policy 0, policy_version 23681 (0.0020) [2024-08-05 16:54:22,029][15444] Updated weights for policy 0, policy_version 23691 (0.0011) [2024-08-05 16:54:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23893.5, 300 sec: 24187.2). Total num frames: 194101248. 
Throughput: 0: 6007.4. Samples: 48519870. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 16:54:23,119][15372] Avg episode reward: [(0, '39.239')] [2024-08-05 16:54:25,333][15444] Updated weights for policy 0, policy_version 23701 (0.0012) [2024-08-05 16:54:26,938][15417] Signal inference workers to stop experience collection... (8800 times) [2024-08-05 16:54:26,939][15417] Signal inference workers to resume experience collection... (8800 times) [2024-08-05 16:54:26,979][15444] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-08-05 16:54:26,979][15444] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-08-05 16:54:28,119][15372] Fps is (10 sec: 23755.5, 60 sec: 24030.4, 300 sec: 24159.4). Total num frames: 194224128. Throughput: 0: 6018.6. Samples: 48556090. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 16:54:28,119][15372] Avg episode reward: [(0, '39.240')] [2024-08-05 16:54:28,635][15444] Updated weights for policy 0, policy_version 23711 (0.0029) [2024-08-05 16:54:32,005][15444] Updated weights for policy 0, policy_version 23721 (0.0032) [2024-08-05 16:54:33,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 194355200. Throughput: 0: 6057.8. Samples: 48593150. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 16:54:33,119][15372] Avg episode reward: [(0, '37.853')] [2024-08-05 16:54:35,099][15444] Updated weights for policy 0, policy_version 23731 (0.0018) [2024-08-05 16:54:38,135][15372] Fps is (10 sec: 24536.5, 60 sec: 24159.8, 300 sec: 24185.9). Total num frames: 194469888. Throughput: 0: 6062.6. Samples: 48612200. 
Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:54:38,143][15372] Avg episode reward: [(0, '38.040')] [2024-08-05 16:54:38,620][15444] Updated weights for policy 0, policy_version 23741 (0.0030) [2024-08-05 16:54:42,007][15444] Updated weights for policy 0, policy_version 23751 (0.0019) [2024-08-05 16:54:43,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 194592768. Throughput: 0: 6051.8. Samples: 48648020. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:54:43,119][15372] Avg episode reward: [(0, '39.206')] [2024-08-05 16:54:45,281][15444] Updated weights for policy 0, policy_version 23761 (0.0017) [2024-08-05 16:54:48,119][15372] Fps is (10 sec: 24616.7, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 194715648. Throughput: 0: 6067.1. Samples: 48684230. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:54:48,119][15372] Avg episode reward: [(0, '39.215')] [2024-08-05 16:54:49,010][15444] Updated weights for policy 0, policy_version 23771 (0.0017) [2024-08-05 16:54:52,037][15444] Updated weights for policy 0, policy_version 23781 (0.0020) [2024-08-05 16:54:53,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 194830336. Throughput: 0: 6071.1. Samples: 48702740. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 16:54:53,126][15372] Avg episode reward: [(0, '39.133')] [2024-08-05 16:54:55,681][15444] Updated weights for policy 0, policy_version 23791 (0.0012) [2024-08-05 16:54:58,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 194953216. Throughput: 0: 6058.7. Samples: 48738100. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 16:54:58,126][15372] Avg episode reward: [(0, '38.772')] [2024-08-05 16:54:59,013][15444] Updated weights for policy 0, policy_version 23801 (0.0012) [2024-08-05 16:55:02,465][15444] Updated weights for policy 0, policy_version 23811 (0.0020) [2024-08-05 16:55:03,119][15372] Fps is (10 sec: 23757.5, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 195067904. Throughput: 0: 6029.6. Samples: 48773860. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 16:55:03,119][15372] Avg episode reward: [(0, '39.605')] [2024-08-05 16:55:05,960][15444] Updated weights for policy 0, policy_version 23821 (0.0013) [2024-08-05 16:55:08,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 195190784. Throughput: 0: 6047.5. Samples: 48792010. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 16:55:08,127][15372] Avg episode reward: [(0, '39.269')] [2024-08-05 16:55:09,368][15444] Updated weights for policy 0, policy_version 23831 (0.0013) [2024-08-05 16:55:11,825][15417] Signal inference workers to stop experience collection... (8850 times) [2024-08-05 16:55:11,826][15417] Signal inference workers to resume experience collection... (8850 times) [2024-08-05 16:55:11,866][15444] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-08-05 16:55:11,867][15444] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-08-05 16:55:12,798][15444] Updated weights for policy 0, policy_version 23841 (0.0013) [2024-08-05 16:55:13,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 195313664. Throughput: 0: 6044.3. Samples: 48828080. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:55:13,119][15372] Avg episode reward: [(0, '38.535')] [2024-08-05 16:55:16,067][15444] Updated weights for policy 0, policy_version 23851 (0.0013) [2024-08-05 16:55:18,119][15372] Fps is (10 sec: 23757.4, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 195428352. Throughput: 0: 6026.6. Samples: 48864350. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:55:18,120][15372] Avg episode reward: [(0, '38.452')] [2024-08-05 16:55:19,629][15444] Updated weights for policy 0, policy_version 23861 (0.0028) [2024-08-05 16:55:22,775][15444] Updated weights for policy 0, policy_version 23871 (0.0011) [2024-08-05 16:55:23,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 195551232. Throughput: 0: 6004.2. Samples: 48882290. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:55:23,119][15372] Avg episode reward: [(0, '38.751')] [2024-08-05 16:55:26,373][15444] Updated weights for policy 0, policy_version 23881 (0.0011) [2024-08-05 16:55:28,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 195674112. Throughput: 0: 6005.4. Samples: 48918260. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:55:28,119][15372] Avg episode reward: [(0, '39.055')] [2024-08-05 16:55:29,764][15444] Updated weights for policy 0, policy_version 23891 (0.0011) [2024-08-05 16:55:33,037][15444] Updated weights for policy 0, policy_version 23901 (0.0012) [2024-08-05 16:55:33,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 195796992. Throughput: 0: 6017.4. Samples: 48955010. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:55:33,119][15372] Avg episode reward: [(0, '39.379')] [2024-08-05 16:55:36,532][15444] Updated weights for policy 0, policy_version 23911 (0.0020) [2024-08-05 16:55:38,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24036.5, 300 sec: 24159.5). 
Total num frames: 195911680. Throughput: 0: 6020.9. Samples: 48973680. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 16:55:38,119][15372] Avg episode reward: [(0, '38.517')] [2024-08-05 16:55:39,503][15444] Updated weights for policy 0, policy_version 23921 (0.0026) [2024-08-05 16:55:43,055][15444] Updated weights for policy 0, policy_version 23931 (0.0020) [2024-08-05 16:55:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 196042752. Throughput: 0: 6054.0. Samples: 49010530. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 16:55:43,119][15372] Avg episode reward: [(0, '38.367')] [2024-08-05 16:55:46,494][15444] Updated weights for policy 0, policy_version 23941 (0.0015) [2024-08-05 16:55:48,121][15372] Fps is (10 sec: 25389.7, 60 sec: 24165.6, 300 sec: 24187.0). Total num frames: 196165632. Throughput: 0: 6062.8. Samples: 49046700. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 16:55:48,128][15372] Avg episode reward: [(0, '38.031')] [2024-08-05 16:55:49,819][15444] Updated weights for policy 0, policy_version 23951 (0.0016) [2024-08-05 16:55:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 196280320. Throughput: 0: 6076.3. Samples: 49065440. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 16:55:53,126][15372] Avg episode reward: [(0, '38.040')] [2024-08-05 16:55:53,220][15444] Updated weights for policy 0, policy_version 23961 (0.0022) [2024-08-05 16:55:56,427][15444] Updated weights for policy 0, policy_version 23971 (0.0020) [2024-08-05 16:55:58,119][15372] Fps is (10 sec: 23761.7, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 196403200. Throughput: 0: 6069.5. Samples: 49101210. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 16:55:58,126][15372] Avg episode reward: [(0, '39.541')] [2024-08-05 16:55:59,303][15417] Signal inference workers to stop experience collection... 
(8900 times) [2024-08-05 16:55:59,304][15417] Signal inference workers to resume experience collection... (8900 times) [2024-08-05 16:55:59,371][15444] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-08-05 16:55:59,371][15444] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-08-05 16:55:59,873][15444] Updated weights for policy 0, policy_version 23981 (0.0010) [2024-08-05 16:56:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 196526080. Throughput: 0: 6088.2. Samples: 49138320. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 16:56:03,126][15372] Avg episode reward: [(0, '39.198')] [2024-08-05 16:56:03,243][15444] Updated weights for policy 0, policy_version 23991 (0.0028) [2024-08-05 16:56:06,797][15444] Updated weights for policy 0, policy_version 24001 (0.0020) [2024-08-05 16:56:08,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.1, 300 sec: 24159.5). Total num frames: 196648960. Throughput: 0: 6092.7. Samples: 49156460. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 16:56:08,119][15372] Avg episode reward: [(0, '38.814')] [2024-08-05 16:56:09,870][15444] Updated weights for policy 0, policy_version 24011 (0.0018) [2024-08-05 16:56:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 196763648. Throughput: 0: 6099.3. Samples: 49192730. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 16:56:13,126][15372] Avg episode reward: [(0, '38.901')] [2024-08-05 16:56:13,368][15444] Updated weights for policy 0, policy_version 24021 (0.0013) [2024-08-05 16:56:17,103][15444] Updated weights for policy 0, policy_version 24031 (0.0016) [2024-08-05 16:56:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 196894720. Throughput: 0: 6077.1. Samples: 49228480. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 16:56:18,119][15372] Avg episode reward: [(0, '38.512')] [2024-08-05 16:56:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000024035_196894720.pth... [2024-08-05 16:56:18,279][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000023326_191086592.pth [2024-08-05 16:56:20,030][15444] Updated weights for policy 0, policy_version 24041 (0.0027) [2024-08-05 16:56:23,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 197009408. Throughput: 0: 6078.0. Samples: 49247190. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 16:56:23,127][15372] Avg episode reward: [(0, '38.764')] [2024-08-05 16:56:23,684][15444] Updated weights for policy 0, policy_version 24051 (0.0027) [2024-08-05 16:56:26,831][15444] Updated weights for policy 0, policy_version 24061 (0.0020) [2024-08-05 16:56:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 197132288. Throughput: 0: 6057.8. Samples: 49283130. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 16:56:28,119][15372] Avg episode reward: [(0, '39.419')] [2024-08-05 16:56:30,293][15444] Updated weights for policy 0, policy_version 24071 (0.0014) [2024-08-05 16:56:33,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 197255168. Throughput: 0: 6063.9. Samples: 49319560. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:56:33,119][15372] Avg episode reward: [(0, '39.890')] [2024-08-05 16:56:33,849][15444] Updated weights for policy 0, policy_version 24081 (0.0027) [2024-08-05 16:56:37,319][15444] Updated weights for policy 0, policy_version 24091 (0.0027) [2024-08-05 16:56:38,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 197369856. 
Throughput: 0: 6047.3. Samples: 49337570. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:56:38,119][15372] Avg episode reward: [(0, '38.542')] [2024-08-05 16:56:40,734][15444] Updated weights for policy 0, policy_version 24101 (0.0012) [2024-08-05 16:56:43,123][15372] Fps is (10 sec: 22926.5, 60 sec: 24027.9, 300 sec: 24131.3). Total num frames: 197484544. Throughput: 0: 6039.8. Samples: 49373030. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 16:56:43,131][15372] Avg episode reward: [(0, '38.629')] [2024-08-05 16:56:44,050][15444] Updated weights for policy 0, policy_version 24111 (0.0012) [2024-08-05 16:56:45,898][15417] Signal inference workers to stop experience collection... (8950 times) [2024-08-05 16:56:45,898][15417] Signal inference workers to resume experience collection... (8950 times) [2024-08-05 16:56:45,975][15444] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-08-05 16:56:45,975][15444] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-08-05 16:56:47,517][15444] Updated weights for policy 0, policy_version 24121 (0.0012) [2024-08-05 16:56:48,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24167.2, 300 sec: 24187.2). Total num frames: 197615616. Throughput: 0: 6030.8. Samples: 49409710. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 16:56:48,127][15372] Avg episode reward: [(0, '39.847')] [2024-08-05 16:56:50,671][15444] Updated weights for policy 0, policy_version 24131 (0.0013) [2024-08-05 16:56:53,118][15372] Fps is (10 sec: 24588.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 197730304. Throughput: 0: 6038.0. Samples: 49428170. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 16:56:53,126][15372] Avg episode reward: [(0, '39.623')] [2024-08-05 16:56:54,261][15444] Updated weights for policy 0, policy_version 24141 (0.0019) [2024-08-05 16:56:57,704][15444] Updated weights for policy 0, policy_version 24151 (0.0013) [2024-08-05 16:56:58,119][15372] Fps is (10 sec: 22937.2, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 197844992. Throughput: 0: 6019.3. Samples: 49463600. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 16:56:58,119][15372] Avg episode reward: [(0, '38.818')] [2024-08-05 16:57:00,793][15444] Updated weights for policy 0, policy_version 24161 (0.0013) [2024-08-05 16:57:03,119][15372] Fps is (10 sec: 24573.5, 60 sec: 24166.0, 300 sec: 24187.1). Total num frames: 197976064. Throughput: 0: 6029.4. Samples: 49499810. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:57:03,127][15372] Avg episode reward: [(0, '38.338')] [2024-08-05 16:57:04,531][15444] Updated weights for policy 0, policy_version 24171 (0.0011) [2024-08-05 16:57:07,822][15444] Updated weights for policy 0, policy_version 24181 (0.0011) [2024-08-05 16:57:08,118][15372] Fps is (10 sec: 24577.2, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 198090752. Throughput: 0: 6004.9. Samples: 49517410. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:57:08,119][15372] Avg episode reward: [(0, '38.293')] [2024-08-05 16:57:11,193][15444] Updated weights for policy 0, policy_version 24191 (0.0011) [2024-08-05 16:57:13,118][15372] Fps is (10 sec: 22939.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 198205440. Throughput: 0: 6002.9. Samples: 49553260. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:57:13,126][15372] Avg episode reward: [(0, '39.655')] [2024-08-05 16:57:14,682][15444] Updated weights for policy 0, policy_version 24201 (0.0020) [2024-08-05 16:57:17,922][15444] Updated weights for policy 0, policy_version 24211 (0.0012) [2024-08-05 16:57:18,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 198336512. Throughput: 0: 6014.7. Samples: 49590220. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 16:57:18,119][15372] Avg episode reward: [(0, '39.975')] [2024-08-05 16:57:21,582][15444] Updated weights for policy 0, policy_version 24221 (0.0011) [2024-08-05 16:57:23,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 198451200. Throughput: 0: 6019.3. Samples: 49608440. Policy #0 lag: (min: 1.0, avg: 4.7, max: 8.0) [2024-08-05 16:57:23,119][15372] Avg episode reward: [(0, '38.854')] [2024-08-05 16:57:23,991][15417] Signal inference workers to stop experience collection... (9000 times) [2024-08-05 16:57:23,992][15417] Signal inference workers to resume experience collection... (9000 times) [2024-08-05 16:57:24,040][15444] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-08-05 16:57:24,040][15444] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-08-05 16:57:24,638][15444] Updated weights for policy 0, policy_version 24231 (0.0012) [2024-08-05 16:57:28,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 198574080. Throughput: 0: 6043.3. Samples: 49644950. 
Policy #0 lag: (min: 1.0, avg: 4.7, max: 8.0) [2024-08-05 16:57:28,127][15372] Avg episode reward: [(0, '39.683')] [2024-08-05 16:57:28,223][15444] Updated weights for policy 0, policy_version 24241 (0.0041) [2024-08-05 16:57:31,744][15444] Updated weights for policy 0, policy_version 24251 (0.0012) [2024-08-05 16:57:33,118][15372] Fps is (10 sec: 25395.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 198705152. Throughput: 0: 6029.2. Samples: 49681020. Policy #0 lag: (min: 1.0, avg: 4.7, max: 8.0) [2024-08-05 16:57:33,119][15372] Avg episode reward: [(0, '39.530')] [2024-08-05 16:57:34,805][15444] Updated weights for policy 0, policy_version 24261 (0.0012) [2024-08-05 16:57:38,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 198811648. Throughput: 0: 6034.2. Samples: 49699710. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 16:57:38,126][15372] Avg episode reward: [(0, '38.562')] [2024-08-05 16:57:38,383][15444] Updated weights for policy 0, policy_version 24271 (0.0015) [2024-08-05 16:57:41,707][15444] Updated weights for policy 0, policy_version 24281 (0.0026) [2024-08-05 16:57:43,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24304.7, 300 sec: 24187.2). Total num frames: 198942720. Throughput: 0: 6050.0. Samples: 49735850. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 16:57:43,119][15372] Avg episode reward: [(0, '38.665')] [2024-08-05 16:57:45,071][15444] Updated weights for policy 0, policy_version 24291 (0.0019) [2024-08-05 16:57:48,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24030.0, 300 sec: 24159.5). Total num frames: 199057408. Throughput: 0: 6034.1. Samples: 49771340. 
Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 16:57:48,128][15372] Avg episode reward: [(0, '38.493')] [2024-08-05 16:57:48,815][15444] Updated weights for policy 0, policy_version 24301 (0.0031) [2024-08-05 16:57:51,711][15444] Updated weights for policy 0, policy_version 24311 (0.0014) [2024-08-05 16:57:53,119][15372] Fps is (10 sec: 23757.6, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 199180288. Throughput: 0: 6063.3. Samples: 49790260. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:57:53,119][15372] Avg episode reward: [(0, '39.153')] [2024-08-05 16:57:55,163][15417] Signal inference workers to stop experience collection... (9050 times) [2024-08-05 16:57:55,164][15417] Signal inference workers to resume experience collection... (9050 times) [2024-08-05 16:57:55,205][15444] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-08-05 16:57:55,205][15444] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-08-05 16:57:55,299][15444] Updated weights for policy 0, policy_version 24321 (0.0016) [2024-08-05 16:57:58,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24439.7, 300 sec: 24215.0). Total num frames: 199311360. Throughput: 0: 6069.6. Samples: 49826390. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:57:58,119][15372] Avg episode reward: [(0, '39.687')] [2024-08-05 16:57:58,728][15444] Updated weights for policy 0, policy_version 24331 (0.0013) [2024-08-05 16:58:02,050][15444] Updated weights for policy 0, policy_version 24341 (0.0018) [2024-08-05 16:58:03,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.8, 300 sec: 24187.2). Total num frames: 199426048. Throughput: 0: 6039.8. Samples: 49862010. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 16:58:03,119][15372] Avg episode reward: [(0, '39.979')] [2024-08-05 16:58:05,478][15444] Updated weights for policy 0, policy_version 24351 (0.0011) [2024-08-05 16:58:08,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24302.8, 300 sec: 24159.4). Total num frames: 199548928. Throughput: 0: 6049.5. Samples: 49880670. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:58:08,129][15372] Avg episode reward: [(0, '40.658')] [2024-08-05 16:58:08,132][15417] Saving new best policy, reward=40.658! [2024-08-05 16:58:08,853][15444] Updated weights for policy 0, policy_version 24361 (0.0014) [2024-08-05 16:58:12,205][15444] Updated weights for policy 0, policy_version 24371 (0.0015) [2024-08-05 16:58:13,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 199663616. Throughput: 0: 6034.2. Samples: 49916490. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:58:13,119][15372] Avg episode reward: [(0, '40.021')] [2024-08-05 16:58:15,702][15444] Updated weights for policy 0, policy_version 24381 (0.0013) [2024-08-05 16:58:18,119][15372] Fps is (10 sec: 23757.4, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 199786496. Throughput: 0: 6033.5. Samples: 49952530. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:58:18,126][15372] Avg episode reward: [(0, '38.156')] [2024-08-05 16:58:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000024388_199786496.pth... 
[2024-08-05 16:58:18,285][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000023680_193986560.pth [2024-08-05 16:58:19,204][15444] Updated weights for policy 0, policy_version 24391 (0.0025) [2024-08-05 16:58:22,432][15444] Updated weights for policy 0, policy_version 24401 (0.0024) [2024-08-05 16:58:23,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.5, 300 sec: 24131.8). Total num frames: 199901184. Throughput: 0: 6011.8. Samples: 49970240. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 16:58:23,127][15372] Avg episode reward: [(0, '40.106')] [2024-08-05 16:58:25,839][15444] Updated weights for policy 0, policy_version 24411 (0.0019) [2024-08-05 16:58:28,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 200024064. Throughput: 0: 6014.3. Samples: 50006490. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:58:28,126][15372] Avg episode reward: [(0, '39.092')] [2024-08-05 16:58:29,362][15444] Updated weights for policy 0, policy_version 24421 (0.0020) [2024-08-05 16:58:32,718][15444] Updated weights for policy 0, policy_version 24431 (0.0028) [2024-08-05 16:58:33,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 200146944. Throughput: 0: 6024.4. Samples: 50042440. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:58:33,119][15372] Avg episode reward: [(0, '38.864')] [2024-08-05 16:58:36,195][15444] Updated weights for policy 0, policy_version 24441 (0.0019) [2024-08-05 16:58:38,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 200261632. Throughput: 0: 6012.6. Samples: 50060830. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 16:58:38,126][15372] Avg episode reward: [(0, '38.933')] [2024-08-05 16:58:39,315][15444] Updated weights for policy 0, policy_version 24451 (0.0015) [2024-08-05 16:58:42,984][15444] Updated weights for policy 0, policy_version 24461 (0.0022) [2024-08-05 16:58:43,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24030.0, 300 sec: 24159.5). Total num frames: 200384512. Throughput: 0: 6017.5. Samples: 50097180. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:58:43,119][15372] Avg episode reward: [(0, '40.476')] [2024-08-05 16:58:45,544][15417] Signal inference workers to stop experience collection... (9100 times) [2024-08-05 16:58:45,549][15417] Signal inference workers to resume experience collection... (9100 times) [2024-08-05 16:58:45,623][15444] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-08-05 16:58:45,624][15444] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-08-05 16:58:46,255][15444] Updated weights for policy 0, policy_version 24471 (0.0020) [2024-08-05 16:58:48,119][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 200507392. Throughput: 0: 6039.6. Samples: 50133790. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:58:48,119][15372] Avg episode reward: [(0, '40.429')] [2024-08-05 16:58:49,388][15444] Updated weights for policy 0, policy_version 24481 (0.0026) [2024-08-05 16:58:52,971][15444] Updated weights for policy 0, policy_version 24491 (0.0020) [2024-08-05 16:58:53,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 200630272. Throughput: 0: 6040.7. Samples: 50152500. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 16:58:53,119][15372] Avg episode reward: [(0, '39.758')] [2024-08-05 16:58:56,196][15444] Updated weights for policy 0, policy_version 24501 (0.0011) [2024-08-05 16:58:58,119][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 200761344. Throughput: 0: 6048.7. Samples: 50188680. Policy #0 lag: (min: 0.0, avg: 4.8, max: 8.0) [2024-08-05 16:58:58,126][15372] Avg episode reward: [(0, '38.310')] [2024-08-05 16:58:59,600][15444] Updated weights for policy 0, policy_version 24511 (0.0015) [2024-08-05 16:59:03,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 200867840. Throughput: 0: 6057.3. Samples: 50225110. Policy #0 lag: (min: 0.0, avg: 4.8, max: 8.0) [2024-08-05 16:59:03,126][15372] Avg episode reward: [(0, '39.964')] [2024-08-05 16:59:03,163][15444] Updated weights for policy 0, policy_version 24521 (0.0033) [2024-08-05 16:59:06,196][15444] Updated weights for policy 0, policy_version 24531 (0.0012) [2024-08-05 16:59:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 200998912. Throughput: 0: 6072.2. Samples: 50243490. Policy #0 lag: (min: 0.0, avg: 4.8, max: 8.0) [2024-08-05 16:59:08,126][15372] Avg episode reward: [(0, '39.637')] [2024-08-05 16:59:09,795][15444] Updated weights for policy 0, policy_version 24541 (0.0013) [2024-08-05 16:59:13,035][15444] Updated weights for policy 0, policy_version 24551 (0.0013) [2024-08-05 16:59:13,119][15372] Fps is (10 sec: 25394.2, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 201121792. Throughput: 0: 6089.1. Samples: 50280500. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:59:13,119][15372] Avg episode reward: [(0, '39.134')] [2024-08-05 16:59:16,508][15444] Updated weights for policy 0, policy_version 24561 (0.0029) [2024-08-05 16:59:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24187.2). 
Total num frames: 201236480. Throughput: 0: 6082.7. Samples: 50316160. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:59:18,126][15372] Avg episode reward: [(0, '39.545')] [2024-08-05 16:59:19,873][15444] Updated weights for policy 0, policy_version 24571 (0.0012) [2024-08-05 16:59:23,118][15372] Fps is (10 sec: 23758.0, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 201359360. Throughput: 0: 6090.7. Samples: 50334910. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:59:23,126][15372] Avg episode reward: [(0, '39.150')] [2024-08-05 16:59:23,123][15444] Updated weights for policy 0, policy_version 24581 (0.0011) [2024-08-05 16:59:26,658][15444] Updated weights for policy 0, policy_version 24591 (0.0026) [2024-08-05 16:59:28,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 201482240. Throughput: 0: 6084.7. Samples: 50370990. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 16:59:28,126][15372] Avg episode reward: [(0, '38.462')] [2024-08-05 16:59:29,922][15444] Updated weights for policy 0, policy_version 24601 (0.0012) [2024-08-05 16:59:33,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24188.6). Total num frames: 201605120. Throughput: 0: 6082.7. Samples: 50407510. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:59:33,126][15372] Avg episode reward: [(0, '38.540')] [2024-08-05 16:59:33,355][15444] Updated weights for policy 0, policy_version 24611 (0.0014) [2024-08-05 16:59:35,512][15417] Signal inference workers to stop experience collection... (9150 times) [2024-08-05 16:59:35,513][15417] Signal inference workers to resume experience collection... 
(9150 times) [2024-08-05 16:59:35,558][15444] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-08-05 16:59:35,558][15444] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-08-05 16:59:36,681][15444] Updated weights for policy 0, policy_version 24621 (0.0011) [2024-08-05 16:59:38,121][15372] Fps is (10 sec: 25389.5, 60 sec: 24575.2, 300 sec: 24214.8). Total num frames: 201736192. Throughput: 0: 6080.4. Samples: 50426130. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:59:38,121][15372] Avg episode reward: [(0, '37.256')] [2024-08-05 16:59:39,907][15444] Updated weights for policy 0, policy_version 24631 (0.0012) [2024-08-05 16:59:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 201850880. Throughput: 0: 6101.8. Samples: 50463260. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 16:59:43,119][15372] Avg episode reward: [(0, '37.819')] [2024-08-05 16:59:43,443][15444] Updated weights for policy 0, policy_version 24641 (0.0012) [2024-08-05 16:59:46,387][15444] Updated weights for policy 0, policy_version 24651 (0.0012) [2024-08-05 16:59:48,118][15372] Fps is (10 sec: 23762.1, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 201973760. Throughput: 0: 6087.1. Samples: 50499030. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:59:48,126][15372] Avg episode reward: [(0, '38.871')] [2024-08-05 16:59:50,023][15444] Updated weights for policy 0, policy_version 24661 (0.0021) [2024-08-05 16:59:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 202096640. Throughput: 0: 6088.7. Samples: 50517480. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:59:53,126][15372] Avg episode reward: [(0, '39.537')] [2024-08-05 16:59:53,526][15444] Updated weights for policy 0, policy_version 24671 (0.0024) [2024-08-05 16:59:56,753][15444] Updated weights for policy 0, policy_version 24681 (0.0026) [2024-08-05 16:59:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 202211328. Throughput: 0: 6077.0. Samples: 50553960. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 16:59:58,119][15372] Avg episode reward: [(0, '39.690')] [2024-08-05 17:00:00,124][15444] Updated weights for policy 0, policy_version 24691 (0.0026) [2024-08-05 17:00:03,126][15372] Fps is (10 sec: 24557.3, 60 sec: 24572.9, 300 sec: 24242.2). Total num frames: 202342400. Throughput: 0: 6095.2. Samples: 50590490. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:00:03,134][15372] Avg episode reward: [(0, '39.776')] [2024-08-05 17:00:03,456][15444] Updated weights for policy 0, policy_version 24701 (0.0017) [2024-08-05 17:00:06,984][15444] Updated weights for policy 0, policy_version 24711 (0.0011) [2024-08-05 17:00:08,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24302.7, 300 sec: 24215.0). Total num frames: 202457088. Throughput: 0: 6090.2. Samples: 50608970. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:00:08,119][15372] Avg episode reward: [(0, '38.540')] [2024-08-05 17:00:10,363][15444] Updated weights for policy 0, policy_version 24721 (0.0012) [2024-08-05 17:00:13,118][15372] Fps is (10 sec: 23775.0, 60 sec: 24303.1, 300 sec: 24242.8). Total num frames: 202579968. Throughput: 0: 6103.1. Samples: 50645630. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:00:13,119][15372] Avg episode reward: [(0, '38.606')] [2024-08-05 17:00:13,492][15444] Updated weights for policy 0, policy_version 24731 (0.0011) [2024-08-05 17:00:17,038][15444] Updated weights for policy 0, policy_version 24741 (0.0021) [2024-08-05 17:00:18,119][15372] Fps is (10 sec: 24576.5, 60 sec: 24439.4, 300 sec: 24242.8). Total num frames: 202702848. Throughput: 0: 6094.9. Samples: 50681780. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:00:18,119][15372] Avg episode reward: [(0, '39.658')] [2024-08-05 17:00:18,192][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000024745_202711040.pth... [2024-08-05 17:00:18,312][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000024035_196894720.pth [2024-08-05 17:00:20,338][15444] Updated weights for policy 0, policy_version 24751 (0.0016) [2024-08-05 17:00:23,119][15372] Fps is (10 sec: 23755.4, 60 sec: 24302.7, 300 sec: 24215.0). Total num frames: 202817536. Throughput: 0: 6088.9. Samples: 50700120. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 17:00:23,120][15372] Avg episode reward: [(0, '39.974')] [2024-08-05 17:00:23,855][15444] Updated weights for policy 0, policy_version 24761 (0.0013) [2024-08-05 17:00:23,995][15417] Signal inference workers to stop experience collection... (9200 times) [2024-08-05 17:00:23,996][15417] Signal inference workers to resume experience collection... 
(9200 times) [2024-08-05 17:00:24,041][15444] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-08-05 17:00:24,050][15444] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-08-05 17:00:26,911][15444] Updated weights for policy 0, policy_version 24771 (0.0012) [2024-08-05 17:00:28,119][15372] Fps is (10 sec: 24576.7, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 202948608. Throughput: 0: 6087.6. Samples: 50737200. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 17:00:28,126][15372] Avg episode reward: [(0, '38.786')] [2024-08-05 17:00:30,312][15444] Updated weights for policy 0, policy_version 24781 (0.0025) [2024-08-05 17:00:33,119][15372] Fps is (10 sec: 25396.4, 60 sec: 24439.4, 300 sec: 24270.5). Total num frames: 203071488. Throughput: 0: 6098.9. Samples: 50773480. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 17:00:33,119][15372] Avg episode reward: [(0, '38.259')] [2024-08-05 17:00:33,776][15444] Updated weights for policy 0, policy_version 24791 (0.0023) [2024-08-05 17:00:37,225][15444] Updated weights for policy 0, policy_version 24801 (0.0011) [2024-08-05 17:00:38,120][15372] Fps is (10 sec: 24571.4, 60 sec: 24303.1, 300 sec: 24242.6). Total num frames: 203194368. Throughput: 0: 6083.7. Samples: 50791260. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:00:38,121][15372] Avg episode reward: [(0, '39.380')] [2024-08-05 17:00:40,513][15444] Updated weights for policy 0, policy_version 24811 (0.0026) [2024-08-05 17:00:43,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24302.9, 300 sec: 24215.2). Total num frames: 203309056. Throughput: 0: 6078.0. Samples: 50827470. 
Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:00:43,126][15372] Avg episode reward: [(0, '39.492')] [2024-08-05 17:00:43,984][15444] Updated weights for policy 0, policy_version 24821 (0.0013) [2024-08-05 17:00:47,450][15444] Updated weights for policy 0, policy_version 24831 (0.0023) [2024-08-05 17:00:48,118][15372] Fps is (10 sec: 22941.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 203423744. Throughput: 0: 6072.6. Samples: 50863710. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:00:48,119][15372] Avg episode reward: [(0, '39.218')] [2024-08-05 17:00:50,687][15444] Updated weights for policy 0, policy_version 24841 (0.0019) [2024-08-05 17:00:53,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 203546624. Throughput: 0: 6070.3. Samples: 50882130. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:00:53,126][15372] Avg episode reward: [(0, '39.902')] [2024-08-05 17:00:54,251][15444] Updated weights for policy 0, policy_version 24851 (0.0011) [2024-08-05 17:00:57,715][15444] Updated weights for policy 0, policy_version 24861 (0.0011) [2024-08-05 17:00:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 203669504. Throughput: 0: 6058.0. Samples: 50918240. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:00:58,119][15372] Avg episode reward: [(0, '40.266')] [2024-08-05 17:01:00,830][15444] Updated weights for policy 0, policy_version 24871 (0.0020) [2024-08-05 17:01:03,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24169.4, 300 sec: 24215.0). Total num frames: 203792384. Throughput: 0: 6048.0. Samples: 50953940. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:01:03,126][15372] Avg episode reward: [(0, '40.081')] [2024-08-05 17:01:04,401][15444] Updated weights for policy 0, policy_version 24881 (0.0023) [2024-08-05 17:01:08,017][15444] Updated weights for policy 0, policy_version 24891 (0.0020) [2024-08-05 17:01:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 203907072. Throughput: 0: 6038.5. Samples: 50971850. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:01:08,119][15372] Avg episode reward: [(0, '38.876')] [2024-08-05 17:01:11,158][15444] Updated weights for policy 0, policy_version 24901 (0.0017) [2024-08-05 17:01:13,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24302.7, 300 sec: 24215.0). Total num frames: 204038144. Throughput: 0: 6019.5. Samples: 51008080. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:01:13,127][15372] Avg episode reward: [(0, '39.435')] [2024-08-05 17:01:14,563][15444] Updated weights for policy 0, policy_version 24911 (0.0032) [2024-08-05 17:01:17,760][15444] Updated weights for policy 0, policy_version 24921 (0.0018) [2024-08-05 17:01:18,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 204152832. Throughput: 0: 6027.3. Samples: 51044710. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:01:18,119][15372] Avg episode reward: [(0, '40.248')] [2024-08-05 17:01:21,195][15444] Updated weights for policy 0, policy_version 24931 (0.0013) [2024-08-05 17:01:23,118][15372] Fps is (10 sec: 23758.0, 60 sec: 24303.2, 300 sec: 24215.0). Total num frames: 204275712. Throughput: 0: 6051.4. Samples: 51063560. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:01:23,126][15372] Avg episode reward: [(0, '39.238')] [2024-08-05 17:01:24,627][15444] Updated weights for policy 0, policy_version 24941 (0.0024) [2024-08-05 17:01:28,096][15444] Updated weights for policy 0, policy_version 24951 (0.0012) [2024-08-05 17:01:28,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 204398592. Throughput: 0: 6062.2. Samples: 51100270. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:01:28,119][15372] Avg episode reward: [(0, '39.937')] [2024-08-05 17:01:28,197][15417] Signal inference workers to stop experience collection... (9250 times) [2024-08-05 17:01:28,204][15417] Signal inference workers to resume experience collection... (9250 times) [2024-08-05 17:01:28,266][15444] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-08-05 17:01:28,267][15444] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-08-05 17:01:31,138][15444] Updated weights for policy 0, policy_version 24961 (0.0011) [2024-08-05 17:01:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 204521472. Throughput: 0: 6061.8. Samples: 51136490. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:01:33,126][15372] Avg episode reward: [(0, '39.902')] [2024-08-05 17:01:34,737][15444] Updated weights for policy 0, policy_version 24971 (0.0017) [2024-08-05 17:01:37,821][15444] Updated weights for policy 0, policy_version 24981 (0.0021) [2024-08-05 17:01:38,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24167.2, 300 sec: 24270.9). Total num frames: 204644352. Throughput: 0: 6064.9. Samples: 51155050. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:01:38,119][15372] Avg episode reward: [(0, '39.391')] [2024-08-05 17:01:41,501][15444] Updated weights for policy 0, policy_version 24991 (0.0011) [2024-08-05 17:01:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 204759040. Throughput: 0: 6063.1. Samples: 51191080. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 17:01:43,119][15372] Avg episode reward: [(0, '39.556')] [2024-08-05 17:01:44,837][15444] Updated weights for policy 0, policy_version 25001 (0.0010) [2024-08-05 17:01:48,080][15444] Updated weights for policy 0, policy_version 25011 (0.0019) [2024-08-05 17:01:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 204890112. Throughput: 0: 6078.0. Samples: 51227450. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 17:01:48,119][15372] Avg episode reward: [(0, '40.447')] [2024-08-05 17:01:51,629][15444] Updated weights for policy 0, policy_version 25021 (0.0020) [2024-08-05 17:01:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24270.6). Total num frames: 205004800. Throughput: 0: 6089.3. Samples: 51245870. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 17:01:53,119][15372] Avg episode reward: [(0, '38.720')] [2024-08-05 17:01:54,773][15444] Updated weights for policy 0, policy_version 25031 (0.0021) [2024-08-05 17:01:58,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 205127680. Throughput: 0: 6095.1. Samples: 51282360. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 17:01:58,127][15372] Avg episode reward: [(0, '39.415')] [2024-08-05 17:01:58,415][15444] Updated weights for policy 0, policy_version 25041 (0.0013) [2024-08-05 17:02:01,730][15444] Updated weights for policy 0, policy_version 25051 (0.0019) [2024-08-05 17:02:03,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.0, 300 sec: 24270.5). 
Total num frames: 205250560. Throughput: 0: 6084.7. Samples: 51318520. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 17:02:03,119][15372] Avg episode reward: [(0, '39.552')] [2024-08-05 17:02:04,988][15444] Updated weights for policy 0, policy_version 25061 (0.0016) [2024-08-05 17:02:08,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24439.4, 300 sec: 24298.3). Total num frames: 205373440. Throughput: 0: 6076.9. Samples: 51337020. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 17:02:08,126][15372] Avg episode reward: [(0, '39.180')] [2024-08-05 17:02:08,446][15444] Updated weights for policy 0, policy_version 25071 (0.0018) [2024-08-05 17:02:11,695][15444] Updated weights for policy 0, policy_version 25081 (0.0019) [2024-08-05 17:02:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.1, 300 sec: 24270.5). Total num frames: 205496320. Throughput: 0: 6075.3. Samples: 51373660. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 17:02:13,119][15372] Avg episode reward: [(0, '39.065')] [2024-08-05 17:02:15,148][15444] Updated weights for policy 0, policy_version 25091 (0.0024) [2024-08-05 17:02:18,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24439.4, 300 sec: 24298.3). Total num frames: 205619200. Throughput: 0: 6078.4. Samples: 51410020. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:02:18,127][15372] Avg episode reward: [(0, '38.927')] [2024-08-05 17:02:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000025100_205619200.pth... 
[2024-08-05 17:02:18,269][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000024388_199786496.pth [2024-08-05 17:02:18,449][15444] Updated weights for policy 0, policy_version 25101 (0.0026) [2024-08-05 17:02:21,962][15444] Updated weights for policy 0, policy_version 25111 (0.0012) [2024-08-05 17:02:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24270.6). Total num frames: 205733888. Throughput: 0: 6073.8. Samples: 51428370. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:02:23,119][15372] Avg episode reward: [(0, '39.528')] [2024-08-05 17:02:25,360][15444] Updated weights for policy 0, policy_version 25121 (0.0018) [2024-08-05 17:02:28,119][15372] Fps is (10 sec: 23757.2, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 205856768. Throughput: 0: 6087.1. Samples: 51465000. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:02:28,119][15372] Avg episode reward: [(0, '39.759')] [2024-08-05 17:02:28,564][15444] Updated weights for policy 0, policy_version 25131 (0.0012) [2024-08-05 17:02:31,056][15417] Signal inference workers to stop experience collection... (9300 times) [2024-08-05 17:02:31,058][15417] Signal inference workers to resume experience collection... (9300 times) [2024-08-05 17:02:31,096][15444] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-08-05 17:02:31,100][15444] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-08-05 17:02:32,316][15444] Updated weights for policy 0, policy_version 25141 (0.0021) [2024-08-05 17:02:33,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 205979648. Throughput: 0: 6074.4. Samples: 51500800. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:02:33,119][15372] Avg episode reward: [(0, '40.257')] [2024-08-05 17:02:35,303][15444] Updated weights for policy 0, policy_version 25151 (0.0012) [2024-08-05 17:02:38,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24270.6). Total num frames: 206102528. Throughput: 0: 6076.9. Samples: 51519330. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:02:38,119][15372] Avg episode reward: [(0, '40.402')] [2024-08-05 17:02:38,899][15444] Updated weights for policy 0, policy_version 25161 (0.0010) [2024-08-05 17:02:42,291][15444] Updated weights for policy 0, policy_version 25171 (0.0018) [2024-08-05 17:02:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 206225408. Throughput: 0: 6068.7. Samples: 51555450. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:02:43,119][15372] Avg episode reward: [(0, '39.688')] [2024-08-05 17:02:45,390][15444] Updated weights for policy 0, policy_version 25181 (0.0019) [2024-08-05 17:02:48,120][15372] Fps is (10 sec: 24571.1, 60 sec: 24302.1, 300 sec: 24298.1). Total num frames: 206348288. Throughput: 0: 6094.2. Samples: 51592770. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:02:48,122][15372] Avg episode reward: [(0, '39.563')] [2024-08-05 17:02:48,728][15444] Updated weights for policy 0, policy_version 25191 (0.0010) [2024-08-05 17:02:52,036][15444] Updated weights for policy 0, policy_version 25201 (0.0012) [2024-08-05 17:02:53,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24302.8, 300 sec: 24242.7). Total num frames: 206462976. Throughput: 0: 6091.8. Samples: 51611150. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:02:53,119][15372] Avg episode reward: [(0, '39.831')] [2024-08-05 17:02:55,693][15444] Updated weights for policy 0, policy_version 25211 (0.0021) [2024-08-05 17:02:58,119][15372] Fps is (10 sec: 23761.1, 60 sec: 24302.9, 300 sec: 24270.5). 
Total num frames: 206585856. Throughput: 0: 6078.6. Samples: 51647200. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:02:58,119][15372] Avg episode reward: [(0, '39.211')] [2024-08-05 17:02:58,943][15444] Updated weights for policy 0, policy_version 25221 (0.0012) [2024-08-05 17:03:02,416][15444] Updated weights for policy 0, policy_version 25231 (0.0036) [2024-08-05 17:03:03,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24303.0, 300 sec: 24270.6). Total num frames: 206708736. Throughput: 0: 6057.6. Samples: 51682610. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:03:03,119][15372] Avg episode reward: [(0, '39.044')] [2024-08-05 17:03:06,030][15444] Updated weights for policy 0, policy_version 25241 (0.0010) [2024-08-05 17:03:08,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 206823424. Throughput: 0: 6072.0. Samples: 51701610. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:03:08,119][15372] Avg episode reward: [(0, '38.912')] [2024-08-05 17:03:08,983][15444] Updated weights for policy 0, policy_version 25251 (0.0018) [2024-08-05 17:03:12,589][15444] Updated weights for policy 0, policy_version 25261 (0.0018) [2024-08-05 17:03:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24270.6). Total num frames: 206946304. Throughput: 0: 6059.4. Samples: 51737670. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:03:13,119][15372] Avg episode reward: [(0, '38.695')] [2024-08-05 17:03:14,790][15417] Signal inference workers to stop experience collection... (9350 times) [2024-08-05 17:03:14,790][15417] Signal inference workers to resume experience collection... 
(9350 times) [2024-08-05 17:03:14,855][15444] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-08-05 17:03:14,855][15444] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-08-05 17:03:15,755][15444] Updated weights for policy 0, policy_version 25271 (0.0012) [2024-08-05 17:03:18,119][15372] Fps is (10 sec: 25395.0, 60 sec: 24303.0, 300 sec: 24326.1). Total num frames: 207077376. Throughput: 0: 6078.9. Samples: 51774350. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:03:18,119][15372] Avg episode reward: [(0, '38.996')] [2024-08-05 17:03:19,116][15444] Updated weights for policy 0, policy_version 25281 (0.0020) [2024-08-05 17:03:22,867][15444] Updated weights for policy 0, policy_version 25291 (0.0016) [2024-08-05 17:03:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 207192064. Throughput: 0: 6099.1. Samples: 51793790. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 17:03:23,119][15372] Avg episode reward: [(0, '39.211')] [2024-08-05 17:03:25,735][15444] Updated weights for policy 0, policy_version 25301 (0.0030) [2024-08-05 17:03:28,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.6, 300 sec: 24326.1). Total num frames: 207323136. Throughput: 0: 6083.3. Samples: 51829200. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 17:03:28,126][15372] Avg episode reward: [(0, '37.989')] [2024-08-05 17:03:29,557][15444] Updated weights for policy 0, policy_version 25311 (0.0030) [2024-08-05 17:03:32,583][15444] Updated weights for policy 0, policy_version 25321 (0.0016) [2024-08-05 17:03:33,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24326.1). Total num frames: 207437824. Throughput: 0: 6039.4. Samples: 51864530. 
Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 17:03:33,119][15372] Avg episode reward: [(0, '38.752')] [2024-08-05 17:03:36,223][15444] Updated weights for policy 0, policy_version 25331 (0.0026) [2024-08-05 17:03:38,119][15372] Fps is (10 sec: 22937.3, 60 sec: 24166.4, 300 sec: 24298.3). Total num frames: 207552512. Throughput: 0: 6054.9. Samples: 51883620. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 17:03:38,119][15372] Avg episode reward: [(0, '38.879')] [2024-08-05 17:03:39,667][15444] Updated weights for policy 0, policy_version 25341 (0.0019) [2024-08-05 17:03:42,917][15444] Updated weights for policy 0, policy_version 25351 (0.0019) [2024-08-05 17:03:43,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24298.3). Total num frames: 207675392. Throughput: 0: 6060.7. Samples: 51919930. Policy #0 lag: (min: 0.0, avg: 4.9, max: 8.0) [2024-08-05 17:03:43,119][15372] Avg episode reward: [(0, '38.013')] [2024-08-05 17:03:44,330][15417] Signal inference workers to stop experience collection... (9400 times) [2024-08-05 17:03:44,335][15417] Signal inference workers to resume experience collection... (9400 times) [2024-08-05 17:03:44,378][15444] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-08-05 17:03:44,378][15444] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-08-05 17:03:46,497][15444] Updated weights for policy 0, policy_version 25361 (0.0013) [2024-08-05 17:03:48,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24167.2, 300 sec: 24298.3). Total num frames: 207798272. Throughput: 0: 6066.2. Samples: 51955590. Policy #0 lag: (min: 0.0, avg: 4.9, max: 8.0) [2024-08-05 17:03:48,119][15372] Avg episode reward: [(0, '38.428')] [2024-08-05 17:03:49,453][15444] Updated weights for policy 0, policy_version 25371 (0.0023) [2024-08-05 17:03:53,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 207921152. Throughput: 0: 6064.7. Samples: 51974520. 
Policy #0 lag: (min: 0.0, avg: 4.9, max: 8.0) [2024-08-05 17:03:53,126][15372] Avg episode reward: [(0, '39.341')] [2024-08-05 17:03:53,127][15444] Updated weights for policy 0, policy_version 25381 (0.0010) [2024-08-05 17:03:56,420][15444] Updated weights for policy 0, policy_version 25391 (0.0012) [2024-08-05 17:03:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24326.1). Total num frames: 208044032. Throughput: 0: 6060.7. Samples: 52010400. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 17:03:58,126][15372] Avg episode reward: [(0, '39.145')] [2024-08-05 17:03:59,714][15444] Updated weights for policy 0, policy_version 25401 (0.0018) [2024-08-05 17:04:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 208158720. Throughput: 0: 6050.0. Samples: 52046600. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 17:04:03,126][15372] Avg episode reward: [(0, '39.073')] [2024-08-05 17:04:03,415][15444] Updated weights for policy 0, policy_version 25411 (0.0014) [2024-08-05 17:04:06,529][15444] Updated weights for policy 0, policy_version 25421 (0.0017) [2024-08-05 17:04:08,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24302.8, 300 sec: 24270.6). Total num frames: 208281600. Throughput: 0: 6034.2. Samples: 52065330. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 17:04:08,127][15372] Avg episode reward: [(0, '39.930')] [2024-08-05 17:04:09,982][15444] Updated weights for policy 0, policy_version 25431 (0.0017) [2024-08-05 17:04:13,122][15372] Fps is (10 sec: 24567.8, 60 sec: 24301.6, 300 sec: 24298.0). Total num frames: 208404480. Throughput: 0: 6064.7. Samples: 52102130. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 17:04:13,130][15372] Avg episode reward: [(0, '39.883')] [2024-08-05 17:04:13,423][15444] Updated weights for policy 0, policy_version 25441 (0.0019) [2024-08-05 17:04:16,182][15417] Signal inference workers to stop experience collection... 
(9450 times) [2024-08-05 17:04:16,182][15417] Signal inference workers to resume experience collection... (9450 times) [2024-08-05 17:04:16,242][15444] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-08-05 17:04:16,245][15444] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-08-05 17:04:16,493][15444] Updated weights for policy 0, policy_version 25451 (0.0016) [2024-08-05 17:04:18,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.3, 300 sec: 24298.3). Total num frames: 208527360. Throughput: 0: 6077.5. Samples: 52138020. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:04:18,126][15372] Avg episode reward: [(0, '38.731')] [2024-08-05 17:04:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000025455_208527360.pth... [2024-08-05 17:04:18,295][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000024745_202711040.pth [2024-08-05 17:04:20,216][15444] Updated weights for policy 0, policy_version 25461 (0.0010) [2024-08-05 17:04:23,118][15372] Fps is (10 sec: 24584.3, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 208650240. Throughput: 0: 6049.8. Samples: 52155860. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:04:23,126][15372] Avg episode reward: [(0, '39.007')] [2024-08-05 17:04:23,410][15444] Updated weights for policy 0, policy_version 25471 (0.0013) [2024-08-05 17:04:26,837][15444] Updated weights for policy 0, policy_version 25481 (0.0019) [2024-08-05 17:04:28,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24166.4, 300 sec: 24298.3). Total num frames: 208773120. Throughput: 0: 6047.1. Samples: 52192050. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:04:28,119][15372] Avg episode reward: [(0, '39.384')] [2024-08-05 17:04:30,094][15444] Updated weights for policy 0, policy_version 25491 (0.0017) [2024-08-05 17:04:33,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.4, 300 sec: 24242.9). Total num frames: 208887808. Throughput: 0: 6075.5. Samples: 52228990. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:04:33,126][15372] Avg episode reward: [(0, '39.270')] [2024-08-05 17:04:33,600][15444] Updated weights for policy 0, policy_version 25501 (0.0019) [2024-08-05 17:04:37,068][15444] Updated weights for policy 0, policy_version 25511 (0.0020) [2024-08-05 17:04:38,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24302.8, 300 sec: 24270.5). Total num frames: 209010688. Throughput: 0: 6055.1. Samples: 52247000. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:04:38,119][15372] Avg episode reward: [(0, '38.521')] [2024-08-05 17:04:40,246][15444] Updated weights for policy 0, policy_version 25521 (0.0017) [2024-08-05 17:04:43,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 209133568. Throughput: 0: 6072.2. Samples: 52283650. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:04:43,126][15372] Avg episode reward: [(0, '38.133')] [2024-08-05 17:04:43,753][15444] Updated weights for policy 0, policy_version 25531 (0.0012) [2024-08-05 17:04:46,865][15444] Updated weights for policy 0, policy_version 25541 (0.0011) [2024-08-05 17:04:48,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.3, 300 sec: 24242.7). Total num frames: 209248256. Throughput: 0: 6072.9. Samples: 52319880. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:04:48,119][15372] Avg episode reward: [(0, '37.978')] [2024-08-05 17:04:50,308][15444] Updated weights for policy 0, policy_version 25551 (0.0032) [2024-08-05 17:04:53,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24302.9, 300 sec: 24298.3). 
Total num frames: 209379328. Throughput: 0: 6064.5. Samples: 52338230. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:04:53,119][15372] Avg episode reward: [(0, '39.891')] [2024-08-05 17:04:53,889][15444] Updated weights for policy 0, policy_version 25561 (0.0019) [2024-08-05 17:04:56,946][15444] Updated weights for policy 0, policy_version 25571 (0.0011) [2024-08-05 17:04:58,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.3, 300 sec: 24243.4). Total num frames: 209494016. Throughput: 0: 6064.4. Samples: 52375010. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:04:58,119][15372] Avg episode reward: [(0, '39.866')] [2024-08-05 17:05:00,465][15444] Updated weights for policy 0, policy_version 25581 (0.0013) [2024-08-05 17:05:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 209625088. Throughput: 0: 6078.5. Samples: 52411550. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:05:03,119][15372] Avg episode reward: [(0, '39.151')] [2024-08-05 17:05:03,775][15444] Updated weights for policy 0, policy_version 25591 (0.0014) [2024-08-05 17:05:07,268][15444] Updated weights for policy 0, policy_version 25601 (0.0012) [2024-08-05 17:05:08,016][15417] Signal inference workers to stop experience collection... (9500 times) [2024-08-05 17:05:08,018][15417] Signal inference workers to resume experience collection... (9500 times) [2024-08-05 17:05:08,067][15444] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-08-05 17:05:08,071][15444] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-08-05 17:05:08,118][15372] Fps is (10 sec: 25395.8, 60 sec: 24439.6, 300 sec: 24298.3). Total num frames: 209747968. Throughput: 0: 6084.9. Samples: 52429680. 
Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:05:08,119][15372] Avg episode reward: [(0, '38.969')] [2024-08-05 17:05:10,796][15444] Updated weights for policy 0, policy_version 25611 (0.0013) [2024-08-05 17:05:13,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24304.2, 300 sec: 24270.5). Total num frames: 209862656. Throughput: 0: 6092.0. Samples: 52466190. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:05:13,119][15372] Avg episode reward: [(0, '39.183')] [2024-08-05 17:05:13,855][15444] Updated weights for policy 0, policy_version 25621 (0.0010) [2024-08-05 17:05:17,496][15444] Updated weights for policy 0, policy_version 25631 (0.0011) [2024-08-05 17:05:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.1, 300 sec: 24298.4). Total num frames: 209985536. Throughput: 0: 6067.8. Samples: 52502040. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:05:18,119][15372] Avg episode reward: [(0, '39.721')] [2024-08-05 17:05:20,767][15444] Updated weights for policy 0, policy_version 25641 (0.0012) [2024-08-05 17:05:23,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 210100224. Throughput: 0: 6080.9. Samples: 52520640. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:05:23,129][15372] Avg episode reward: [(0, '39.548')] [2024-08-05 17:05:24,159][15444] Updated weights for policy 0, policy_version 25651 (0.0018) [2024-08-05 17:05:27,785][15444] Updated weights for policy 0, policy_version 25661 (0.0014) [2024-08-05 17:05:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 210223104. Throughput: 0: 6058.9. Samples: 52556300. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:05:28,119][15372] Avg episode reward: [(0, '39.349')] [2024-08-05 17:05:30,951][15444] Updated weights for policy 0, policy_version 25671 (0.0011) [2024-08-05 17:05:33,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24302.8, 300 sec: 24242.9). 
Total num frames: 210345984. Throughput: 0: 6042.6. Samples: 52591800. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:05:33,127][15372] Avg episode reward: [(0, '39.625')] [2024-08-05 17:05:34,504][15444] Updated weights for policy 0, policy_version 25681 (0.0018) [2024-08-05 17:05:38,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24030.0, 300 sec: 24215.0). Total num frames: 210452480. Throughput: 0: 6039.6. Samples: 52610010. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:05:38,119][15372] Avg episode reward: [(0, '39.252')] [2024-08-05 17:05:38,153][15444] Updated weights for policy 0, policy_version 25691 (0.0030) [2024-08-05 17:05:41,134][15444] Updated weights for policy 0, policy_version 25701 (0.0043) [2024-08-05 17:05:43,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24166.5, 300 sec: 24270.5). Total num frames: 210583552. Throughput: 0: 6017.8. Samples: 52645810. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:05:43,126][15372] Avg episode reward: [(0, '37.761')] [2024-08-05 17:05:44,839][15444] Updated weights for policy 0, policy_version 25711 (0.0018) [2024-08-05 17:05:45,174][15417] Signal inference workers to stop experience collection... (9550 times) [2024-08-05 17:05:45,175][15417] Signal inference workers to resume experience collection... (9550 times) [2024-08-05 17:05:45,251][15444] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-08-05 17:05:45,251][15444] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-08-05 17:05:47,826][15444] Updated weights for policy 0, policy_version 25721 (0.0020) [2024-08-05 17:05:48,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 210706432. Throughput: 0: 6024.4. Samples: 52682650. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:05:48,119][15372] Avg episode reward: [(0, '39.112')] [2024-08-05 17:05:51,477][15444] Updated weights for policy 0, policy_version 25731 (0.0021) [2024-08-05 17:05:53,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24242.8). Total num frames: 210821120. Throughput: 0: 6028.7. Samples: 52700970. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:05:53,119][15372] Avg episode reward: [(0, '39.745')] [2024-08-05 17:05:55,159][15444] Updated weights for policy 0, policy_version 25741 (0.0011) [2024-08-05 17:05:58,106][15444] Updated weights for policy 0, policy_version 25751 (0.0014) [2024-08-05 17:05:58,120][15372] Fps is (10 sec: 24572.3, 60 sec: 24302.4, 300 sec: 24270.4). Total num frames: 210952192. Throughput: 0: 6012.0. Samples: 52736740. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 17:05:58,120][15372] Avg episode reward: [(0, '38.733')] [2024-08-05 17:06:01,801][15444] Updated weights for policy 0, policy_version 25761 (0.0019) [2024-08-05 17:06:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24270.5). Total num frames: 211066880. Throughput: 0: 6009.1. Samples: 52772450. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 17:06:03,119][15372] Avg episode reward: [(0, '39.216')] [2024-08-05 17:06:05,083][15444] Updated weights for policy 0, policy_version 25771 (0.0015) [2024-08-05 17:06:08,118][15372] Fps is (10 sec: 23760.4, 60 sec: 24029.8, 300 sec: 24242.8). Total num frames: 211189760. Throughput: 0: 6020.0. Samples: 52791540. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 17:06:08,127][15372] Avg episode reward: [(0, '39.741')] [2024-08-05 17:06:08,367][15444] Updated weights for policy 0, policy_version 25781 (0.0012) [2024-08-05 17:06:11,930][15444] Updated weights for policy 0, policy_version 25791 (0.0023) [2024-08-05 17:06:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24270.5). 
Total num frames: 211312640. Throughput: 0: 6027.6. Samples: 52827540. Policy #0 lag: (min: 2.0, avg: 4.5, max: 8.0) [2024-08-05 17:06:13,119][15372] Avg episode reward: [(0, '39.765')] [2024-08-05 17:06:14,418][15417] Signal inference workers to stop experience collection... (9600 times) [2024-08-05 17:06:14,418][15417] Signal inference workers to resume experience collection... (9600 times) [2024-08-05 17:06:14,450][15444] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-08-05 17:06:14,450][15444] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-08-05 17:06:14,931][15444] Updated weights for policy 0, policy_version 25801 (0.0022) [2024-08-05 17:06:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24242.8). Total num frames: 211427328. Throughput: 0: 6059.6. Samples: 52864480. Policy #0 lag: (min: 2.0, avg: 4.5, max: 8.0) [2024-08-05 17:06:18,126][15372] Avg episode reward: [(0, '39.337')] [2024-08-05 17:06:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000025809_211427328.pth... [2024-08-05 17:06:18,308][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000025100_205619200.pth [2024-08-05 17:06:18,667][15444] Updated weights for policy 0, policy_version 25811 (0.0014) [2024-08-05 17:06:22,249][15444] Updated weights for policy 0, policy_version 25821 (0.0035) [2024-08-05 17:06:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 211550208. Throughput: 0: 6053.1. Samples: 52882400. Policy #0 lag: (min: 2.0, avg: 4.5, max: 8.0) [2024-08-05 17:06:23,119][15372] Avg episode reward: [(0, '39.222')] [2024-08-05 17:06:25,162][15444] Updated weights for policy 0, policy_version 25831 (0.0014) [2024-08-05 17:06:28,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24242.8). 
Total num frames: 211673088. Throughput: 0: 6072.0. Samples: 52919050. Policy #0 lag: (min: 2.0, avg: 4.5, max: 8.0) [2024-08-05 17:06:28,126][15372] Avg episode reward: [(0, '37.994')] [2024-08-05 17:06:28,837][15444] Updated weights for policy 0, policy_version 25841 (0.0010) [2024-08-05 17:06:32,029][15444] Updated weights for policy 0, policy_version 25851 (0.0031) [2024-08-05 17:06:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24030.1, 300 sec: 24215.0). Total num frames: 211787776. Throughput: 0: 6048.4. Samples: 52954830. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:06:33,119][15372] Avg episode reward: [(0, '39.797')] [2024-08-05 17:06:35,413][15444] Updated weights for policy 0, policy_version 25861 (0.0011) [2024-08-05 17:06:38,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.4, 300 sec: 24270.5). Total num frames: 211918848. Throughput: 0: 6053.1. Samples: 52973360. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:06:38,119][15372] Avg episode reward: [(0, '39.606')] [2024-08-05 17:06:38,696][15444] Updated weights for policy 0, policy_version 25871 (0.0011) [2024-08-05 17:06:42,062][15444] Updated weights for policy 0, policy_version 25881 (0.0025) [2024-08-05 17:06:43,118][15372] Fps is (10 sec: 25395.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 212041728. Throughput: 0: 6072.0. Samples: 53009970. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:06:43,119][15372] Avg episode reward: [(0, '40.056')] [2024-08-05 17:06:45,771][15444] Updated weights for policy 0, policy_version 25891 (0.0025) [2024-08-05 17:06:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 212156416. Throughput: 0: 6074.9. Samples: 53045820. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:06:48,126][15372] Avg episode reward: [(0, '40.183')] [2024-08-05 17:06:48,845][15444] Updated weights for policy 0, policy_version 25901 (0.0020) [2024-08-05 17:06:52,495][15444] Updated weights for policy 0, policy_version 25911 (0.0013) [2024-08-05 17:06:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 212279296. Throughput: 0: 6048.7. Samples: 53063730. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:06:53,119][15372] Avg episode reward: [(0, '40.059')] [2024-08-05 17:06:53,166][15417] Signal inference workers to stop experience collection... (9650 times) [2024-08-05 17:06:53,170][15417] Signal inference workers to resume experience collection... (9650 times) [2024-08-05 17:06:53,213][15444] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-08-05 17:06:53,213][15444] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-08-05 17:06:55,777][15444] Updated weights for policy 0, policy_version 25921 (0.0012) [2024-08-05 17:06:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24167.0, 300 sec: 24242.8). Total num frames: 212402176. Throughput: 0: 6047.6. Samples: 53099680. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:06:58,126][15372] Avg episode reward: [(0, '39.084')] [2024-08-05 17:06:59,303][15444] Updated weights for policy 0, policy_version 25931 (0.0027) [2024-08-05 17:07:02,718][15444] Updated weights for policy 0, policy_version 25941 (0.0023) [2024-08-05 17:07:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 212516864. Throughput: 0: 6016.5. Samples: 53135220. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:07:03,119][15372] Avg episode reward: [(0, '39.165')] [2024-08-05 17:07:05,949][15444] Updated weights for policy 0, policy_version 25951 (0.0012) [2024-08-05 17:07:08,123][15372] Fps is (10 sec: 23745.0, 60 sec: 24164.4, 300 sec: 24214.6). Total num frames: 212639744. Throughput: 0: 6024.2. Samples: 53153520. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:07:08,131][15372] Avg episode reward: [(0, '39.416')] [2024-08-05 17:07:09,622][15444] Updated weights for policy 0, policy_version 25961 (0.0017) [2024-08-05 17:07:12,702][15444] Updated weights for policy 0, policy_version 25971 (0.0029) [2024-08-05 17:07:13,121][15372] Fps is (10 sec: 23750.7, 60 sec: 24028.8, 300 sec: 24187.0). Total num frames: 212754432. Throughput: 0: 6016.8. Samples: 53189820. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:07:13,121][15372] Avg episode reward: [(0, '39.295')] [2024-08-05 17:07:16,391][15444] Updated weights for policy 0, policy_version 25981 (0.0022) [2024-08-05 17:07:18,118][15372] Fps is (10 sec: 23768.6, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 212877312. Throughput: 0: 6018.9. Samples: 53225680. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:07:18,119][15372] Avg episode reward: [(0, '40.334')] [2024-08-05 17:07:19,794][15444] Updated weights for policy 0, policy_version 25991 (0.0021) [2024-08-05 17:07:22,988][15444] Updated weights for policy 0, policy_version 26001 (0.0019) [2024-08-05 17:07:23,119][15372] Fps is (10 sec: 24582.0, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 213000192. Throughput: 0: 6011.3. Samples: 53243870. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:07:23,119][15372] Avg episode reward: [(0, '40.022')] [2024-08-05 17:07:26,418][15444] Updated weights for policy 0, policy_version 26011 (0.0011) [2024-08-05 17:07:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). 
Total num frames: 213114880. Throughput: 0: 5997.8. Samples: 53279870. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:07:28,126][15372] Avg episode reward: [(0, '39.150')] [2024-08-05 17:07:29,691][15444] Updated weights for policy 0, policy_version 26021 (0.0013) [2024-08-05 17:07:33,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 213237760. Throughput: 0: 6013.3. Samples: 53316420. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:07:33,127][15372] Avg episode reward: [(0, '39.919')] [2024-08-05 17:07:33,195][15444] Updated weights for policy 0, policy_version 26031 (0.0020) [2024-08-05 17:07:36,634][15444] Updated weights for policy 0, policy_version 26041 (0.0011) [2024-08-05 17:07:38,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24029.7, 300 sec: 24187.2). Total num frames: 213360640. Throughput: 0: 6028.8. Samples: 53335030. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:07:38,127][15372] Avg episode reward: [(0, '40.128')] [2024-08-05 17:07:39,849][15444] Updated weights for policy 0, policy_version 26051 (0.0012) [2024-08-05 17:07:43,043][15417] Signal inference workers to stop experience collection... (9700 times) [2024-08-05 17:07:43,051][15417] Signal inference workers to resume experience collection... (9700 times) [2024-08-05 17:07:43,085][15444] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-08-05 17:07:43,085][15444] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-08-05 17:07:43,119][15372] Fps is (10 sec: 24574.6, 60 sec: 24029.5, 300 sec: 24187.3). Total num frames: 213483520. Throughput: 0: 6046.5. Samples: 53371780. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:07:43,120][15372] Avg episode reward: [(0, '39.895')] [2024-08-05 17:07:43,126][15444] Updated weights for policy 0, policy_version 26061 (0.0014) [2024-08-05 17:07:46,524][15444] Updated weights for policy 0, policy_version 26071 (0.0019) [2024-08-05 17:07:48,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.2, 300 sec: 24215.0). Total num frames: 213606400. Throughput: 0: 6051.7. Samples: 53407550. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:07:48,127][15372] Avg episode reward: [(0, '39.366')] [2024-08-05 17:07:49,962][15444] Updated weights for policy 0, policy_version 26081 (0.0020) [2024-08-05 17:07:53,118][15372] Fps is (10 sec: 24578.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 213729280. Throughput: 0: 6067.6. Samples: 53426530. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:07:53,126][15372] Avg episode reward: [(0, '39.173')] [2024-08-05 17:07:53,362][15444] Updated weights for policy 0, policy_version 26091 (0.0012) [2024-08-05 17:07:56,942][15444] Updated weights for policy 0, policy_version 26101 (0.0033) [2024-08-05 17:07:58,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 213852160. Throughput: 0: 6069.0. Samples: 53462910. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:07:58,119][15372] Avg episode reward: [(0, '39.043')] [2024-08-05 17:07:59,955][15444] Updated weights for policy 0, policy_version 26111 (0.0021) [2024-08-05 17:08:03,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 213966848. Throughput: 0: 6073.3. Samples: 53498980. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:08:03,120][15372] Avg episode reward: [(0, '39.428')] [2024-08-05 17:08:03,603][15444] Updated weights for policy 0, policy_version 26121 (0.0014) [2024-08-05 17:08:06,697][15444] Updated weights for policy 0, policy_version 26131 (0.0020) [2024-08-05 17:08:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24168.4, 300 sec: 24215.0). Total num frames: 214089728. Throughput: 0: 6084.9. Samples: 53517690. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:08:08,119][15372] Avg episode reward: [(0, '39.916')] [2024-08-05 17:08:10,329][15444] Updated weights for policy 0, policy_version 26141 (0.0011) [2024-08-05 17:08:13,119][15372] Fps is (10 sec: 25395.3, 60 sec: 24440.4, 300 sec: 24215.0). Total num frames: 214220800. Throughput: 0: 6093.5. Samples: 53554080. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:08:13,119][15372] Avg episode reward: [(0, '39.302')] [2024-08-05 17:08:13,654][15444] Updated weights for policy 0, policy_version 26151 (0.0013) [2024-08-05 17:08:17,138][15444] Updated weights for policy 0, policy_version 26161 (0.0012) [2024-08-05 17:08:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 214335488. Throughput: 0: 6070.3. Samples: 53589580. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:08:18,119][15372] Avg episode reward: [(0, '39.606')] [2024-08-05 17:08:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000026164_214335488.pth... [2024-08-05 17:08:18,251][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000025455_208527360.pth [2024-08-05 17:08:20,350][15444] Updated weights for policy 0, policy_version 26171 (0.0035) [2024-08-05 17:08:23,118][15372] Fps is (10 sec: 22938.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 214450176. 
Throughput: 0: 6068.1. Samples: 53608090. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:08:23,119][15372] Avg episode reward: [(0, '40.262')] [2024-08-05 17:08:23,804][15444] Updated weights for policy 0, policy_version 26181 (0.0012) [2024-08-05 17:08:27,174][15444] Updated weights for policy 0, policy_version 26191 (0.0020) [2024-08-05 17:08:28,063][15417] Signal inference workers to stop experience collection... (9750 times) [2024-08-05 17:08:28,063][15417] Signal inference workers to resume experience collection... (9750 times) [2024-08-05 17:08:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 214573056. Throughput: 0: 6042.6. Samples: 53643690. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:08:28,119][15372] Avg episode reward: [(0, '39.067')] [2024-08-05 17:08:28,142][15444] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-08-05 17:08:28,142][15444] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-08-05 17:08:30,427][15444] Updated weights for policy 0, policy_version 26201 (0.0018) [2024-08-05 17:08:33,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 214695936. Throughput: 0: 6074.5. Samples: 53680900. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:08:33,119][15372] Avg episode reward: [(0, '38.678')] [2024-08-05 17:08:34,032][15444] Updated weights for policy 0, policy_version 26211 (0.0011) [2024-08-05 17:08:37,176][15444] Updated weights for policy 0, policy_version 26221 (0.0021) [2024-08-05 17:08:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24303.1, 300 sec: 24215.0). Total num frames: 214818816. Throughput: 0: 6050.4. Samples: 53698800. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:08:38,126][15372] Avg episode reward: [(0, '40.100')] [2024-08-05 17:08:40,610][15444] Updated weights for policy 0, policy_version 26231 (0.0017) [2024-08-05 17:08:43,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.3, 300 sec: 24215.0). Total num frames: 214941696. Throughput: 0: 6060.5. Samples: 53735630. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:08:43,119][15372] Avg episode reward: [(0, '40.374')] [2024-08-05 17:08:43,911][15444] Updated weights for policy 0, policy_version 26241 (0.0022) [2024-08-05 17:08:47,439][15444] Updated weights for policy 0, policy_version 26251 (0.0011) [2024-08-05 17:08:48,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24303.1, 300 sec: 24215.0). Total num frames: 215064576. Throughput: 0: 6059.1. Samples: 53771640. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:08:48,119][15372] Avg episode reward: [(0, '38.535')] [2024-08-05 17:08:50,809][15444] Updated weights for policy 0, policy_version 26261 (0.0030) [2024-08-05 17:08:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 215179264. Throughput: 0: 6050.9. Samples: 53789980. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:08:53,126][15372] Avg episode reward: [(0, '38.281')] [2024-08-05 17:08:54,250][15444] Updated weights for policy 0, policy_version 26271 (0.0012) [2024-08-05 17:08:57,699][15444] Updated weights for policy 0, policy_version 26281 (0.0012) [2024-08-05 17:08:58,119][15372] Fps is (10 sec: 22937.1, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 215293952. Throughput: 0: 6034.4. Samples: 53825630. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:08:58,119][15372] Avg episode reward: [(0, '40.473')] [2024-08-05 17:09:00,828][15444] Updated weights for policy 0, policy_version 26291 (0.0024) [2024-08-05 17:09:03,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.0, 300 sec: 24215.0). 
Total num frames: 215425024. Throughput: 0: 6052.4. Samples: 53861940. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:09:03,126][15372] Avg episode reward: [(0, '40.216')] [2024-08-05 17:09:04,602][15444] Updated weights for policy 0, policy_version 26301 (0.0012) [2024-08-05 17:09:07,903][15444] Updated weights for policy 0, policy_version 26311 (0.0018) [2024-08-05 17:09:08,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24187.5). Total num frames: 215539712. Throughput: 0: 6034.4. Samples: 53879640. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:09:08,119][15372] Avg episode reward: [(0, '39.861')] [2024-08-05 17:09:11,188][15444] Updated weights for policy 0, policy_version 26321 (0.0022) [2024-08-05 17:09:13,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23893.4, 300 sec: 24159.5). Total num frames: 215654400. Throughput: 0: 6036.9. Samples: 53915350. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:09:13,126][15372] Avg episode reward: [(0, '39.870')] [2024-08-05 17:09:14,747][15444] Updated weights for policy 0, policy_version 26331 (0.0015) [2024-08-05 17:09:17,998][15444] Updated weights for policy 0, policy_version 26341 (0.0037) [2024-08-05 17:09:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 215785472. Throughput: 0: 6024.2. Samples: 53951990. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:09:18,119][15372] Avg episode reward: [(0, '38.935')] [2024-08-05 17:09:19,880][15417] Signal inference workers to stop experience collection... (9800 times) [2024-08-05 17:09:19,880][15417] Signal inference workers to resume experience collection... 
(9800 times) [2024-08-05 17:09:19,953][15444] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-08-05 17:09:19,953][15444] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-08-05 17:09:21,483][15444] Updated weights for policy 0, policy_version 26351 (0.0014) [2024-08-05 17:09:23,118][15372] Fps is (10 sec: 25395.0, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 215908352. Throughput: 0: 6037.3. Samples: 53970480. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:09:23,119][15372] Avg episode reward: [(0, '40.937')] [2024-08-05 17:09:23,119][15417] Saving new best policy, reward=40.937! [2024-08-05 17:09:24,734][15444] Updated weights for policy 0, policy_version 26361 (0.0015) [2024-08-05 17:09:28,120][15372] Fps is (10 sec: 23753.5, 60 sec: 24165.8, 300 sec: 24187.1). Total num frames: 216023040. Throughput: 0: 6032.5. Samples: 54007100. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:09:28,120][15372] Avg episode reward: [(0, '40.003')] [2024-08-05 17:09:28,171][15444] Updated weights for policy 0, policy_version 26371 (0.0011) [2024-08-05 17:09:31,411][15444] Updated weights for policy 0, policy_version 26381 (0.0014) [2024-08-05 17:09:33,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 216145920. Throughput: 0: 6032.8. Samples: 54043120. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:09:33,127][15372] Avg episode reward: [(0, '39.049')] [2024-08-05 17:09:34,675][15444] Updated weights for policy 0, policy_version 26391 (0.0018) [2024-08-05 17:09:38,119][15372] Fps is (10 sec: 24579.3, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 216268800. Throughput: 0: 6048.2. Samples: 54062150. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:09:38,126][15372] Avg episode reward: [(0, '39.202')] [2024-08-05 17:09:38,316][15444] Updated weights for policy 0, policy_version 26401 (0.0021) [2024-08-05 17:09:41,607][15444] Updated weights for policy 0, policy_version 26411 (0.0023) [2024-08-05 17:09:43,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 216391680. Throughput: 0: 6055.1. Samples: 54098110. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:09:43,126][15372] Avg episode reward: [(0, '40.009')] [2024-08-05 17:09:44,971][15444] Updated weights for policy 0, policy_version 26421 (0.0018) [2024-08-05 17:09:48,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 216514560. Throughput: 0: 6053.3. Samples: 54134340. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:09:48,126][15372] Avg episode reward: [(0, '40.548')] [2024-08-05 17:09:48,371][15444] Updated weights for policy 0, policy_version 26431 (0.0010) [2024-08-05 17:09:51,584][15444] Updated weights for policy 0, policy_version 26441 (0.0029) [2024-08-05 17:09:53,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 216629248. Throughput: 0: 6075.6. Samples: 54153040. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:09:53,126][15372] Avg episode reward: [(0, '40.773')] [2024-08-05 17:09:55,122][15444] Updated weights for policy 0, policy_version 26451 (0.0019) [2024-08-05 17:09:58,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 216760320. Throughput: 0: 6105.6. Samples: 54190100. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:09:58,126][15372] Avg episode reward: [(0, '40.567')] [2024-08-05 17:09:58,363][15444] Updated weights for policy 0, policy_version 26461 (0.0023) [2024-08-05 17:10:01,778][15444] Updated weights for policy 0, policy_version 26471 (0.0014) [2024-08-05 17:10:03,118][15372] Fps is (10 sec: 25394.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 216883200. Throughput: 0: 6093.6. Samples: 54226200. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:10:03,119][15372] Avg episode reward: [(0, '39.760')] [2024-08-05 17:10:05,079][15444] Updated weights for policy 0, policy_version 26481 (0.0010) [2024-08-05 17:10:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 217006080. Throughput: 0: 6108.9. Samples: 54245380. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:10:08,126][15372] Avg episode reward: [(0, '39.394')] [2024-08-05 17:10:08,344][15444] Updated weights for policy 0, policy_version 26491 (0.0013) [2024-08-05 17:10:10,488][15417] Signal inference workers to stop experience collection... (9850 times) [2024-08-05 17:10:10,489][15417] Signal inference workers to resume experience collection... (9850 times) [2024-08-05 17:10:10,555][15444] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-08-05 17:10:10,555][15444] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-08-05 17:10:11,581][15444] Updated weights for policy 0, policy_version 26501 (0.0013) [2024-08-05 17:10:13,119][15372] Fps is (10 sec: 24574.3, 60 sec: 24575.7, 300 sec: 24214.9). Total num frames: 217128960. Throughput: 0: 6106.3. Samples: 54281880. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:10:13,119][15372] Avg episode reward: [(0, '39.751')] [2024-08-05 17:10:14,811][15444] Updated weights for policy 0, policy_version 26511 (0.0014) [2024-08-05 17:10:18,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24439.4, 300 sec: 24242.8). Total num frames: 217251840. Throughput: 0: 6123.4. Samples: 54318670. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:10:18,126][15372] Avg episode reward: [(0, '39.966')] [2024-08-05 17:10:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000026520_217251840.pth... [2024-08-05 17:10:18,266][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000025809_211427328.pth [2024-08-05 17:10:18,429][15444] Updated weights for policy 0, policy_version 26521 (0.0028) [2024-08-05 17:10:21,851][15444] Updated weights for policy 0, policy_version 26531 (0.0012) [2024-08-05 17:10:23,119][15372] Fps is (10 sec: 24577.1, 60 sec: 24439.4, 300 sec: 24242.7). Total num frames: 217374720. Throughput: 0: 6113.5. Samples: 54337260. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:10:23,119][15372] Avg episode reward: [(0, '39.949')] [2024-08-05 17:10:25,056][15444] Updated weights for policy 0, policy_version 26541 (0.0014) [2024-08-05 17:10:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24440.0, 300 sec: 24215.0). Total num frames: 217489408. Throughput: 0: 6125.8. Samples: 54373770. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:10:28,127][15372] Avg episode reward: [(0, '40.511')] [2024-08-05 17:10:28,647][15444] Updated weights for policy 0, policy_version 26551 (0.0027) [2024-08-05 17:10:31,939][15444] Updated weights for policy 0, policy_version 26561 (0.0012) [2024-08-05 17:10:33,118][15372] Fps is (10 sec: 23757.5, 60 sec: 24439.6, 300 sec: 24270.5). Total num frames: 217612288. 
Throughput: 0: 6122.2. Samples: 54409840. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:10:33,119][15372] Avg episode reward: [(0, '39.413')] [2024-08-05 17:10:35,195][15444] Updated weights for policy 0, policy_version 26571 (0.0012) [2024-08-05 17:10:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 217735168. Throughput: 0: 6110.2. Samples: 54428000. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:10:38,126][15372] Avg episode reward: [(0, '39.001')] [2024-08-05 17:10:38,812][15444] Updated weights for policy 0, policy_version 26581 (0.0013) [2024-08-05 17:10:41,896][15444] Updated weights for policy 0, policy_version 26591 (0.0012) [2024-08-05 17:10:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 217849856. Throughput: 0: 6086.2. Samples: 54463980. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:10:43,119][15372] Avg episode reward: [(0, '38.762')] [2024-08-05 17:10:45,498][15444] Updated weights for policy 0, policy_version 26601 (0.0035) [2024-08-05 17:10:48,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 217980928. Throughput: 0: 6100.9. Samples: 54500740. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:10:48,119][15372] Avg episode reward: [(0, '39.408')] [2024-08-05 17:10:48,862][15444] Updated weights for policy 0, policy_version 26611 (0.0012) [2024-08-05 17:10:49,754][15417] Signal inference workers to stop experience collection... (9900 times) [2024-08-05 17:10:49,754][15417] Signal inference workers to resume experience collection... 
(9900 times) [2024-08-05 17:10:49,805][15444] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-08-05 17:10:49,818][15444] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-08-05 17:10:52,141][15444] Updated weights for policy 0, policy_version 26621 (0.0018) [2024-08-05 17:10:53,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24439.3, 300 sec: 24215.1). Total num frames: 218095616. Throughput: 0: 6066.6. Samples: 54518380. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:10:53,126][15372] Avg episode reward: [(0, '40.169')] [2024-08-05 17:10:55,718][15444] Updated weights for policy 0, policy_version 26631 (0.0023) [2024-08-05 17:10:58,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24302.9, 300 sec: 24242.7). Total num frames: 218218496. Throughput: 0: 6072.5. Samples: 54555140. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:10:58,119][15372] Avg episode reward: [(0, '39.106')] [2024-08-05 17:10:58,901][15444] Updated weights for policy 0, policy_version 26641 (0.0012) [2024-08-05 17:11:02,517][15444] Updated weights for policy 0, policy_version 26651 (0.0016) [2024-08-05 17:11:03,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 218341376. Throughput: 0: 6056.2. Samples: 54591200. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:11:03,119][15372] Avg episode reward: [(0, '39.894')] [2024-08-05 17:11:05,549][15444] Updated weights for policy 0, policy_version 26661 (0.0018) [2024-08-05 17:11:08,131][15372] Fps is (10 sec: 23726.4, 60 sec: 24161.2, 300 sec: 24213.9). Total num frames: 218456064. Throughput: 0: 6033.4. Samples: 54608840. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:11:08,139][15372] Avg episode reward: [(0, '39.624')] [2024-08-05 17:11:09,275][15444] Updated weights for policy 0, policy_version 26671 (0.0026) [2024-08-05 17:11:12,836][15444] Updated weights for policy 0, policy_version 26681 (0.0014) [2024-08-05 17:11:13,119][15372] Fps is (10 sec: 22936.8, 60 sec: 24030.0, 300 sec: 24215.0). Total num frames: 218570752. Throughput: 0: 6015.3. Samples: 54644460. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:11:13,119][15372] Avg episode reward: [(0, '39.182')] [2024-08-05 17:11:16,222][15444] Updated weights for policy 0, policy_version 26691 (0.0016) [2024-08-05 17:11:18,118][15372] Fps is (10 sec: 24607.8, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 218701824. Throughput: 0: 6010.7. Samples: 54680320. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:11:18,126][15372] Avg episode reward: [(0, '40.174')] [2024-08-05 17:11:19,579][15444] Updated weights for policy 0, policy_version 26701 (0.0015) [2024-08-05 17:11:22,800][15444] Updated weights for policy 0, policy_version 26711 (0.0025) [2024-08-05 17:11:23,118][15372] Fps is (10 sec: 25396.0, 60 sec: 24166.5, 300 sec: 24242.8). Total num frames: 218824704. Throughput: 0: 6031.3. Samples: 54699410. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:11:23,119][15372] Avg episode reward: [(0, '40.268')] [2024-08-05 17:11:26,149][15444] Updated weights for policy 0, policy_version 26721 (0.0015) [2024-08-05 17:11:28,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 218939392. Throughput: 0: 6028.7. Samples: 54735270. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:11:28,126][15372] Avg episode reward: [(0, '41.661')] [2024-08-05 17:11:28,129][15417] Saving new best policy, reward=41.661! 
[2024-08-05 17:11:29,658][15444] Updated weights for policy 0, policy_version 26731 (0.0014) [2024-08-05 17:11:32,856][15444] Updated weights for policy 0, policy_version 26741 (0.0017) [2024-08-05 17:11:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 219062272. Throughput: 0: 6019.8. Samples: 54771630. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:11:33,119][15372] Avg episode reward: [(0, '40.875')] [2024-08-05 17:11:36,250][15444] Updated weights for policy 0, policy_version 26751 (0.0025) [2024-08-05 17:11:38,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 219185152. Throughput: 0: 6045.8. Samples: 54790440. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:11:38,119][15372] Avg episode reward: [(0, '39.438')] [2024-08-05 17:11:39,817][15444] Updated weights for policy 0, policy_version 26761 (0.0024) [2024-08-05 17:11:40,405][15417] Signal inference workers to stop experience collection... (9950 times) [2024-08-05 17:11:40,405][15417] Signal inference workers to resume experience collection... (9950 times) [2024-08-05 17:11:40,453][15444] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-08-05 17:11:40,453][15444] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-08-05 17:11:43,005][15444] Updated weights for policy 0, policy_version 26771 (0.0019) [2024-08-05 17:11:43,124][15372] Fps is (10 sec: 24562.4, 60 sec: 24300.7, 300 sec: 24242.3). Total num frames: 219308032. Throughput: 0: 6037.9. Samples: 54826880. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:11:43,124][15372] Avg episode reward: [(0, '39.546')] [2024-08-05 17:11:46,515][15444] Updated weights for policy 0, policy_version 26781 (0.0022) [2024-08-05 17:11:48,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23893.4, 300 sec: 24187.2). Total num frames: 219414528. Throughput: 0: 6027.6. Samples: 54862440. 
Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 17:11:48,126][15372] Avg episode reward: [(0, '39.670')] [2024-08-05 17:11:50,083][15444] Updated weights for policy 0, policy_version 26791 (0.0013) [2024-08-05 17:11:53,119][15372] Fps is (10 sec: 23769.0, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 219545600. Throughput: 0: 6047.2. Samples: 54880890. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 17:11:53,127][15372] Avg episode reward: [(0, '40.053')] [2024-08-05 17:11:53,206][15444] Updated weights for policy 0, policy_version 26801 (0.0012) [2024-08-05 17:11:56,829][15444] Updated weights for policy 0, policy_version 26811 (0.0013) [2024-08-05 17:11:58,119][15372] Fps is (10 sec: 25392.5, 60 sec: 24166.1, 300 sec: 24242.7). Total num frames: 219668480. Throughput: 0: 6063.5. Samples: 54917320. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 17:11:58,120][15372] Avg episode reward: [(0, '40.400')] [2024-08-05 17:11:59,942][15444] Updated weights for policy 0, policy_version 26821 (0.0014) [2024-08-05 17:12:03,119][15372] Fps is (10 sec: 23757.4, 60 sec: 24029.8, 300 sec: 24215.4). Total num frames: 219783168. Throughput: 0: 6069.8. Samples: 54953460. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 17:12:03,126][15372] Avg episode reward: [(0, '38.827')] [2024-08-05 17:12:03,510][15444] Updated weights for policy 0, policy_version 26831 (0.0011) [2024-08-05 17:12:07,316][15444] Updated weights for policy 0, policy_version 26841 (0.0022) [2024-08-05 17:12:07,854][15417] Signal inference workers to stop experience collection... (10000 times) [2024-08-05 17:12:07,855][15417] Signal inference workers to resume experience collection... 
(10000 times) [2024-08-05 17:12:07,916][15444] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-08-05 17:12:07,922][15444] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-08-05 17:12:08,118][15372] Fps is (10 sec: 23759.3, 60 sec: 24171.6, 300 sec: 24243.0). Total num frames: 219906048. Throughput: 0: 6045.3. Samples: 54971450. Policy #0 lag: (min: 2.0, avg: 4.3, max: 9.0) [2024-08-05 17:12:08,119][15372] Avg episode reward: [(0, '39.448')] [2024-08-05 17:12:10,136][15444] Updated weights for policy 0, policy_version 26851 (0.0019) [2024-08-05 17:12:13,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24303.1, 300 sec: 24242.8). Total num frames: 220028928. Throughput: 0: 6058.2. Samples: 55007890. Policy #0 lag: (min: 2.0, avg: 4.3, max: 9.0) [2024-08-05 17:12:13,119][15372] Avg episode reward: [(0, '40.327')] [2024-08-05 17:12:13,906][15444] Updated weights for policy 0, policy_version 26861 (0.0018) [2024-08-05 17:12:16,943][15444] Updated weights for policy 0, policy_version 26871 (0.0014) [2024-08-05 17:12:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 220143616. Throughput: 0: 6044.0. Samples: 55043610. Policy #0 lag: (min: 2.0, avg: 4.3, max: 9.0) [2024-08-05 17:12:18,119][15372] Avg episode reward: [(0, '39.424')] [2024-08-05 17:12:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000026873_220143616.pth... [2024-08-05 17:12:18,271][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000026164_214335488.pth [2024-08-05 17:12:20,534][15444] Updated weights for policy 0, policy_version 26881 (0.0012) [2024-08-05 17:12:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 220274688. Throughput: 0: 6026.9. Samples: 55061650. 
Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 17:12:23,119][15372] Avg episode reward: [(0, '38.665')] [2024-08-05 17:12:24,163][15444] Updated weights for policy 0, policy_version 26891 (0.0022) [2024-08-05 17:12:27,214][15444] Updated weights for policy 0, policy_version 26901 (0.0044) [2024-08-05 17:12:28,120][15372] Fps is (10 sec: 23753.1, 60 sec: 24029.3, 300 sec: 24214.9). Total num frames: 220381184. Throughput: 0: 6033.0. Samples: 55098340. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 17:12:28,128][15372] Avg episode reward: [(0, '40.170')] [2024-08-05 17:12:30,866][15444] Updated weights for policy 0, policy_version 26911 (0.0017) [2024-08-05 17:12:33,121][15372] Fps is (10 sec: 23751.3, 60 sec: 24165.5, 300 sec: 24242.6). Total num frames: 220512256. Throughput: 0: 6045.5. Samples: 55134500. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 17:12:33,121][15372] Avg episode reward: [(0, '40.286')] [2024-08-05 17:12:33,907][15444] Updated weights for policy 0, policy_version 26921 (0.0031) [2024-08-05 17:12:35,374][15417] Signal inference workers to stop experience collection... (10050 times) [2024-08-05 17:12:35,375][15417] Signal inference workers to resume experience collection... (10050 times) [2024-08-05 17:12:35,437][15444] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-08-05 17:12:35,444][15444] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-08-05 17:12:37,490][15444] Updated weights for policy 0, policy_version 26931 (0.0020) [2024-08-05 17:12:38,119][15372] Fps is (10 sec: 25398.7, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 220635136. Throughput: 0: 6029.4. Samples: 55152210. 
Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 17:12:38,119][15372] Avg episode reward: [(0, '40.013')] [2024-08-05 17:12:40,746][15444] Updated weights for policy 0, policy_version 26941 (0.0012) [2024-08-05 17:12:43,119][15372] Fps is (10 sec: 23761.6, 60 sec: 24032.0, 300 sec: 24215.0). Total num frames: 220749824. Throughput: 0: 6032.5. Samples: 55188780. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 17:12:43,126][15372] Avg episode reward: [(0, '40.664')] [2024-08-05 17:12:44,184][15444] Updated weights for policy 0, policy_version 26951 (0.0013) [2024-08-05 17:12:47,970][15444] Updated weights for policy 0, policy_version 26961 (0.0019) [2024-08-05 17:12:48,119][15372] Fps is (10 sec: 22937.7, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 220864512. Throughput: 0: 6022.2. Samples: 55224460. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 17:12:48,119][15372] Avg episode reward: [(0, '40.201')] [2024-08-05 17:12:50,915][15444] Updated weights for policy 0, policy_version 26971 (0.0012) [2024-08-05 17:12:53,119][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 220995584. Throughput: 0: 6021.8. Samples: 55242430. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 17:12:53,126][15372] Avg episode reward: [(0, '39.229')] [2024-08-05 17:12:54,583][15444] Updated weights for policy 0, policy_version 26981 (0.0018) [2024-08-05 17:12:58,118][15372] Fps is (10 sec: 23757.2, 60 sec: 23893.8, 300 sec: 24187.2). Total num frames: 221102080. Throughput: 0: 6011.6. Samples: 55278410. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 17:12:58,126][15372] Avg episode reward: [(0, '38.491')] [2024-08-05 17:12:58,160][15444] Updated weights for policy 0, policy_version 26991 (0.0013) [2024-08-05 17:13:01,211][15444] Updated weights for policy 0, policy_version 27001 (0.0016) [2024-08-05 17:13:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.5, 300 sec: 24215.0). 
Total num frames: 221233152. Throughput: 0: 6012.0. Samples: 55314150. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 17:13:03,126][15372] Avg episode reward: [(0, '38.738')] [2024-08-05 17:13:04,568][15444] Updated weights for policy 0, policy_version 27011 (0.0016) [2024-08-05 17:13:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 221347840. Throughput: 0: 6021.1. Samples: 55332600. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 17:13:08,126][15372] Avg episode reward: [(0, '40.237')] [2024-08-05 17:13:08,173][15444] Updated weights for policy 0, policy_version 27021 (0.0017) [2024-08-05 17:13:11,345][15444] Updated weights for policy 0, policy_version 27031 (0.0011) [2024-08-05 17:13:13,120][15372] Fps is (10 sec: 23753.3, 60 sec: 24029.3, 300 sec: 24187.1). Total num frames: 221470720. Throughput: 0: 6010.9. Samples: 55368830. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 17:13:13,128][15372] Avg episode reward: [(0, '39.870')] [2024-08-05 17:13:14,834][15444] Updated weights for policy 0, policy_version 27041 (0.0011) [2024-08-05 17:13:15,602][15417] Signal inference workers to stop experience collection... (10100 times) [2024-08-05 17:13:15,602][15417] Signal inference workers to resume experience collection... (10100 times) [2024-08-05 17:13:15,656][15444] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-08-05 17:13:15,663][15444] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-08-05 17:13:17,958][15444] Updated weights for policy 0, policy_version 27051 (0.0025) [2024-08-05 17:13:18,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 221601792. Throughput: 0: 6035.7. Samples: 55406090. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:13:18,119][15372] Avg episode reward: [(0, '39.527')] [2024-08-05 17:13:21,617][15444] Updated weights for policy 0, policy_version 27061 (0.0025) [2024-08-05 17:13:23,119][15372] Fps is (10 sec: 24579.2, 60 sec: 24029.8, 300 sec: 24215.0). Total num frames: 221716480. Throughput: 0: 6054.4. Samples: 55424660. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:13:23,119][15372] Avg episode reward: [(0, '39.428')] [2024-08-05 17:13:24,558][15444] Updated weights for policy 0, policy_version 27071 (0.0010) [2024-08-05 17:13:28,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.6, 300 sec: 24215.0). Total num frames: 221839360. Throughput: 0: 6049.6. Samples: 55461010. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:13:28,126][15372] Avg episode reward: [(0, '39.149')] [2024-08-05 17:13:28,238][15444] Updated weights for policy 0, policy_version 27081 (0.0029) [2024-08-05 17:13:31,766][15444] Updated weights for policy 0, policy_version 27091 (0.0029) [2024-08-05 17:13:33,118][15372] Fps is (10 sec: 25395.7, 60 sec: 24303.9, 300 sec: 24242.8). Total num frames: 221970432. Throughput: 0: 6056.0. Samples: 55496980. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 17:13:33,119][15372] Avg episode reward: [(0, '39.906')] [2024-08-05 17:13:34,871][15444] Updated weights for policy 0, policy_version 27101 (0.0014) [2024-08-05 17:13:38,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 222076928. Throughput: 0: 6074.7. Samples: 55515790. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 17:13:38,126][15372] Avg episode reward: [(0, '40.110')] [2024-08-05 17:13:38,417][15444] Updated weights for policy 0, policy_version 27111 (0.0032) [2024-08-05 17:13:41,397][15444] Updated weights for policy 0, policy_version 27121 (0.0018) [2024-08-05 17:13:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.1, 300 sec: 24215.0). 
Total num frames: 222208000. Throughput: 0: 6069.6. Samples: 55551540. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 17:13:43,126][15372] Avg episode reward: [(0, '40.375')] [2024-08-05 17:13:45,033][15444] Updated weights for policy 0, policy_version 27131 (0.0018) [2024-08-05 17:13:45,743][15417] Signal inference workers to stop experience collection... (10150 times) [2024-08-05 17:13:45,744][15417] Signal inference workers to resume experience collection... (10150 times) [2024-08-05 17:13:45,808][15444] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-08-05 17:13:45,808][15444] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-08-05 17:13:48,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 222330880. Throughput: 0: 6098.7. Samples: 55588590. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 17:13:48,119][15372] Avg episode reward: [(0, '40.493')] [2024-08-05 17:13:48,490][15444] Updated weights for policy 0, policy_version 27141 (0.0029) [2024-08-05 17:13:51,579][15444] Updated weights for policy 0, policy_version 27151 (0.0011) [2024-08-05 17:13:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24270.6). Total num frames: 222453760. Throughput: 0: 6106.4. Samples: 55607390. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:13:53,126][15372] Avg episode reward: [(0, '40.394')] [2024-08-05 17:13:55,023][15444] Updated weights for policy 0, policy_version 27161 (0.0011) [2024-08-05 17:13:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24576.0, 300 sec: 24242.8). Total num frames: 222576640. Throughput: 0: 6121.3. Samples: 55644280. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:13:58,126][15372] Avg episode reward: [(0, '39.913')] [2024-08-05 17:13:58,402][15444] Updated weights for policy 0, policy_version 27171 (0.0017) [2024-08-05 17:14:01,643][15444] Updated weights for policy 0, policy_version 27181 (0.0011) [2024-08-05 17:14:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 222691328. Throughput: 0: 6098.0. Samples: 55680500. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:14:03,126][15372] Avg episode reward: [(0, '39.077')] [2024-08-05 17:14:05,198][15444] Updated weights for policy 0, policy_version 27191 (0.0053) [2024-08-05 17:14:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24576.0, 300 sec: 24298.3). Total num frames: 222822400. Throughput: 0: 6083.4. Samples: 55698410. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:14:08,126][15372] Avg episode reward: [(0, '39.213')] [2024-08-05 17:14:08,695][15444] Updated weights for policy 0, policy_version 27201 (0.0020) [2024-08-05 17:14:11,962][15444] Updated weights for policy 0, policy_version 27211 (0.0014) [2024-08-05 17:14:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.5, 300 sec: 24215.0). Total num frames: 222928896. Throughput: 0: 6085.3. Samples: 55734850. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:14:13,119][15372] Avg episode reward: [(0, '40.113')] [2024-08-05 17:14:15,596][15444] Updated weights for policy 0, policy_version 27221 (0.0029) [2024-08-05 17:14:18,124][15372] Fps is (10 sec: 23742.7, 60 sec: 24300.5, 300 sec: 24242.3). Total num frames: 223059968. Throughput: 0: 6098.1. Samples: 55771430. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:14:18,125][15372] Avg episode reward: [(0, '39.530')] [2024-08-05 17:14:18,128][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000027229_223059968.pth... 
[2024-08-05 17:14:18,272][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000026520_217251840.pth [2024-08-05 17:14:18,532][15444] Updated weights for policy 0, policy_version 27231 (0.0014) [2024-08-05 17:14:22,235][15444] Updated weights for policy 0, policy_version 27241 (0.0026) [2024-08-05 17:14:23,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24439.5, 300 sec: 24270.7). Total num frames: 223182848. Throughput: 0: 6063.1. Samples: 55788630. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:14:23,119][15372] Avg episode reward: [(0, '39.701')] [2024-08-05 17:14:25,518][15444] Updated weights for policy 0, policy_version 27251 (0.0012) [2024-08-05 17:14:28,132][15372] Fps is (10 sec: 23739.0, 60 sec: 24297.5, 300 sec: 24241.7). Total num frames: 223297536. Throughput: 0: 6084.4. Samples: 55825420. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 17:14:28,132][15372] Avg episode reward: [(0, '40.205')] [2024-08-05 17:14:28,481][15417] Signal inference workers to stop experience collection... (10200 times) [2024-08-05 17:14:28,489][15417] Signal inference workers to resume experience collection... (10200 times) [2024-08-05 17:14:28,559][15444] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-08-05 17:14:28,566][15444] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-08-05 17:14:28,963][15444] Updated weights for policy 0, policy_version 27261 (0.0024) [2024-08-05 17:14:32,329][15444] Updated weights for policy 0, policy_version 27271 (0.0012) [2024-08-05 17:14:33,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 223420416. Throughput: 0: 6059.8. Samples: 55861280. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 17:14:33,126][15372] Avg episode reward: [(0, '39.362')] [2024-08-05 17:14:35,371][15444] Updated weights for policy 0, policy_version 27281 (0.0012) [2024-08-05 17:14:38,119][15372] Fps is (10 sec: 24608.7, 60 sec: 24439.4, 300 sec: 24242.8). Total num frames: 223543296. Throughput: 0: 6046.0. Samples: 55879460. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 17:14:38,119][15372] Avg episode reward: [(0, '39.532')] [2024-08-05 17:14:39,215][15444] Updated weights for policy 0, policy_version 27291 (0.0010) [2024-08-05 17:14:42,521][15444] Updated weights for policy 0, policy_version 27301 (0.0014) [2024-08-05 17:14:43,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 223657984. Throughput: 0: 6030.4. Samples: 55915650. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 17:14:43,119][15372] Avg episode reward: [(0, '40.851')] [2024-08-05 17:14:46,042][15444] Updated weights for policy 0, policy_version 27311 (0.0019) [2024-08-05 17:14:48,122][15372] Fps is (10 sec: 23748.2, 60 sec: 24164.9, 300 sec: 24242.4). Total num frames: 223780864. Throughput: 0: 6024.6. Samples: 55951630. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 17:14:48,123][15372] Avg episode reward: [(0, '40.165')] [2024-08-05 17:14:49,480][15444] Updated weights for policy 0, policy_version 27321 (0.0028) [2024-08-05 17:14:52,784][15444] Updated weights for policy 0, policy_version 27331 (0.0013) [2024-08-05 17:14:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 223903744. Throughput: 0: 6016.4. Samples: 55969150. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 17:14:53,119][15372] Avg episode reward: [(0, '39.627')] [2024-08-05 17:14:56,393][15444] Updated weights for policy 0, policy_version 27341 (0.0013) [2024-08-05 17:14:58,118][15372] Fps is (10 sec: 23765.8, 60 sec: 24029.9, 300 sec: 24187.2). 
Total num frames: 224018432. Throughput: 0: 6007.1. Samples: 56005170. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 17:14:58,119][15372] Avg episode reward: [(0, '40.494')] [2024-08-05 17:14:59,478][15444] Updated weights for policy 0, policy_version 27351 (0.0017) [2024-08-05 17:15:03,067][15444] Updated weights for policy 0, policy_version 27361 (0.0010) [2024-08-05 17:15:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 224141312. Throughput: 0: 6004.1. Samples: 56041580. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:15:03,119][15372] Avg episode reward: [(0, '40.621')] [2024-08-05 17:15:06,325][15444] Updated weights for policy 0, policy_version 27371 (0.0018) [2024-08-05 17:15:08,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.8, 300 sec: 24187.3). Total num frames: 224264192. Throughput: 0: 6035.5. Samples: 56060230. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:15:08,126][15372] Avg episode reward: [(0, '40.461')] [2024-08-05 17:15:09,613][15444] Updated weights for policy 0, policy_version 27381 (0.0017) [2024-08-05 17:15:13,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 224378880. Throughput: 0: 6035.1. Samples: 56096920. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:15:13,127][15372] Avg episode reward: [(0, '39.395')] [2024-08-05 17:15:13,168][15444] Updated weights for policy 0, policy_version 27391 (0.0023) [2024-08-05 17:15:16,363][15444] Updated weights for policy 0, policy_version 27401 (0.0025) [2024-08-05 17:15:17,532][15417] Signal inference workers to stop experience collection... (10250 times) [2024-08-05 17:15:17,540][15417] Signal inference workers to resume experience collection... 
(10250 times) [2024-08-05 17:15:17,574][15444] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-08-05 17:15:17,584][15444] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-08-05 17:15:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24032.2, 300 sec: 24159.5). Total num frames: 224501760. Throughput: 0: 6031.3. Samples: 56132690. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:15:18,119][15372] Avg episode reward: [(0, '39.643')] [2024-08-05 17:15:19,549][15444] Updated weights for policy 0, policy_version 27411 (0.0017) [2024-08-05 17:15:23,077][15444] Updated weights for policy 0, policy_version 27421 (0.0012) [2024-08-05 17:15:23,118][15372] Fps is (10 sec: 25396.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 224632832. Throughput: 0: 6058.0. Samples: 56152070. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:15:23,119][15372] Avg episode reward: [(0, '40.383')] [2024-08-05 17:15:26,467][15444] Updated weights for policy 0, policy_version 27431 (0.0013) [2024-08-05 17:15:28,118][15372] Fps is (10 sec: 25395.4, 60 sec: 24308.4, 300 sec: 24215.0). Total num frames: 224755712. Throughput: 0: 6061.8. Samples: 56188430. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:15:28,126][15372] Avg episode reward: [(0, '39.717')] [2024-08-05 17:15:29,754][15444] Updated weights for policy 0, policy_version 27441 (0.0019) [2024-08-05 17:15:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 224870400. Throughput: 0: 6074.3. Samples: 56224950. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:15:33,126][15372] Avg episode reward: [(0, '39.233')] [2024-08-05 17:15:33,139][15444] Updated weights for policy 0, policy_version 27451 (0.0010) [2024-08-05 17:15:36,310][15444] Updated weights for policy 0, policy_version 27461 (0.0012) [2024-08-05 17:15:38,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.0, 300 sec: 24242.8). 
Total num frames: 225001472. Throughput: 0: 6096.4. Samples: 56243490. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:15:38,126][15372] Avg episode reward: [(0, '40.029')] [2024-08-05 17:15:39,686][15444] Updated weights for policy 0, policy_version 27471 (0.0021) [2024-08-05 17:15:43,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 225116160. Throughput: 0: 6120.2. Samples: 56280580. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 17:15:43,126][15372] Avg episode reward: [(0, '40.219')] [2024-08-05 17:15:43,426][15444] Updated weights for policy 0, policy_version 27481 (0.0013) [2024-08-05 17:15:46,459][15444] Updated weights for policy 0, policy_version 27491 (0.0012) [2024-08-05 17:15:48,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24304.5, 300 sec: 24215.0). Total num frames: 225239040. Throughput: 0: 6106.7. Samples: 56316380. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 17:15:48,126][15372] Avg episode reward: [(0, '39.616')] [2024-08-05 17:15:50,175][15444] Updated weights for policy 0, policy_version 27501 (0.0021) [2024-08-05 17:15:53,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 225361920. Throughput: 0: 6078.0. Samples: 56333740. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 17:15:53,126][15372] Avg episode reward: [(0, '40.157')] [2024-08-05 17:15:53,441][15444] Updated weights for policy 0, policy_version 27511 (0.0018) [2024-08-05 17:15:56,840][15444] Updated weights for policy 0, policy_version 27521 (0.0020) [2024-08-05 17:15:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 225476608. Throughput: 0: 6077.6. Samples: 56370410. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:15:58,119][15372] Avg episode reward: [(0, '41.264')] [2024-08-05 17:16:00,382][15444] Updated weights for policy 0, policy_version 27531 (0.0021) [2024-08-05 17:16:03,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24302.8, 300 sec: 24216.0). Total num frames: 225599488. Throughput: 0: 6096.9. Samples: 56407050. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:16:03,119][15372] Avg episode reward: [(0, '40.951')] [2024-08-05 17:16:03,401][15444] Updated weights for policy 0, policy_version 27541 (0.0013) [2024-08-05 17:16:07,115][15444] Updated weights for policy 0, policy_version 27551 (0.0014) [2024-08-05 17:16:07,212][15417] Signal inference workers to stop experience collection... (10300 times) [2024-08-05 17:16:07,214][15417] Signal inference workers to resume experience collection... (10300 times) [2024-08-05 17:16:07,278][15444] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-08-05 17:16:07,278][15444] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-08-05 17:16:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 225722368. Throughput: 0: 6066.4. Samples: 56425060. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:16:08,119][15372] Avg episode reward: [(0, '40.570')] [2024-08-05 17:16:10,178][15444] Updated weights for policy 0, policy_version 27561 (0.0016) [2024-08-05 17:16:13,121][15372] Fps is (10 sec: 23751.3, 60 sec: 24302.0, 300 sec: 24187.0). Total num frames: 225837056. Throughput: 0: 6065.9. Samples: 56461410. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:16:13,129][15372] Avg episode reward: [(0, '40.670')] [2024-08-05 17:16:13,673][15444] Updated weights for policy 0, policy_version 27571 (0.0019) [2024-08-05 17:16:17,300][15444] Updated weights for policy 0, policy_version 27581 (0.0017) [2024-08-05 17:16:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 225968128. Throughput: 0: 6051.8. Samples: 56497280. Policy #0 lag: (min: 2.0, avg: 4.4, max: 8.0) [2024-08-05 17:16:18,119][15372] Avg episode reward: [(0, '40.478')] [2024-08-05 17:16:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000027584_225968128.pth... [2024-08-05 17:16:18,243][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000026873_220143616.pth [2024-08-05 17:16:20,340][15444] Updated weights for policy 0, policy_version 27591 (0.0021) [2024-08-05 17:16:23,119][15372] Fps is (10 sec: 24582.3, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 226082816. Throughput: 0: 6039.1. Samples: 56515250. Policy #0 lag: (min: 2.0, avg: 4.4, max: 8.0) [2024-08-05 17:16:23,119][15372] Avg episode reward: [(0, '40.105')] [2024-08-05 17:16:24,080][15444] Updated weights for policy 0, policy_version 27601 (0.0027) [2024-08-05 17:16:27,645][15444] Updated weights for policy 0, policy_version 27611 (0.0013) [2024-08-05 17:16:28,119][15372] Fps is (10 sec: 22936.9, 60 sec: 24029.7, 300 sec: 24187.2). Total num frames: 226197504. Throughput: 0: 6012.8. Samples: 56551160. Policy #0 lag: (min: 2.0, avg: 4.4, max: 8.0) [2024-08-05 17:16:28,119][15372] Avg episode reward: [(0, '40.746')] [2024-08-05 17:16:30,656][15444] Updated weights for policy 0, policy_version 27621 (0.0018) [2024-08-05 17:16:33,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 226328576. 
Throughput: 0: 6023.3. Samples: 56587430. Policy #0 lag: (min: 2.0, avg: 4.4, max: 8.0) [2024-08-05 17:16:33,126][15372] Avg episode reward: [(0, '40.476')] [2024-08-05 17:16:34,384][15444] Updated weights for policy 0, policy_version 27631 (0.0022) [2024-08-05 17:16:37,434][15444] Updated weights for policy 0, policy_version 27641 (0.0029) [2024-08-05 17:16:38,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24029.9, 300 sec: 24187.7). Total num frames: 226443264. Throughput: 0: 6040.7. Samples: 56605570. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 17:16:38,126][15372] Avg episode reward: [(0, '40.538')] [2024-08-05 17:16:39,398][15417] Signal inference workers to stop experience collection... (10350 times) [2024-08-05 17:16:39,399][15417] Signal inference workers to resume experience collection... (10350 times) [2024-08-05 17:16:39,442][15444] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-08-05 17:16:39,443][15444] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-08-05 17:16:40,951][15444] Updated weights for policy 0, policy_version 27651 (0.0016) [2024-08-05 17:16:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 226566144. Throughput: 0: 6032.2. Samples: 56641860. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 17:16:43,119][15372] Avg episode reward: [(0, '39.948')] [2024-08-05 17:16:44,511][15444] Updated weights for policy 0, policy_version 27661 (0.0012) [2024-08-05 17:16:47,756][15444] Updated weights for policy 0, policy_version 27671 (0.0015) [2024-08-05 17:16:48,119][15372] Fps is (10 sec: 24574.7, 60 sec: 24166.2, 300 sec: 24215.0). Total num frames: 226689024. Throughput: 0: 6019.5. Samples: 56677930. 
Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 17:16:48,119][15372] Avg episode reward: [(0, '40.578')] [2024-08-05 17:16:51,050][15444] Updated weights for policy 0, policy_version 27681 (0.0019) [2024-08-05 17:16:53,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24187.3). Total num frames: 226803712. Throughput: 0: 6033.3. Samples: 56696560. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 17:16:53,119][15372] Avg episode reward: [(0, '39.624')] [2024-08-05 17:16:54,629][15444] Updated weights for policy 0, policy_version 27691 (0.0013) [2024-08-05 17:16:57,867][15444] Updated weights for policy 0, policy_version 27701 (0.0023) [2024-08-05 17:16:58,118][15372] Fps is (10 sec: 23758.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 226926592. Throughput: 0: 6029.5. Samples: 56732720. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 17:16:58,119][15372] Avg episode reward: [(0, '40.559')] [2024-08-05 17:17:01,368][15444] Updated weights for policy 0, policy_version 27711 (0.0012) [2024-08-05 17:17:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24030.0, 300 sec: 24187.2). Total num frames: 227041280. Throughput: 0: 6024.2. Samples: 56768370. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 17:17:03,126][15372] Avg episode reward: [(0, '40.689')] [2024-08-05 17:17:04,463][15444] Updated weights for policy 0, policy_version 27721 (0.0017) [2024-08-05 17:17:08,112][15444] Updated weights for policy 0, policy_version 27731 (0.0011) [2024-08-05 17:17:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 227172352. Throughput: 0: 6047.6. Samples: 56787390. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 17:17:08,120][15372] Avg episode reward: [(0, '40.057')] [2024-08-05 17:17:11,610][15444] Updated weights for policy 0, policy_version 27741 (0.0028) [2024-08-05 17:17:13,119][15372] Fps is (10 sec: 25394.9, 60 sec: 24304.0, 300 sec: 24242.8). 
Total num frames: 227295232. Throughput: 0: 6035.4. Samples: 56822750. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:17:13,119][15372] Avg episode reward: [(0, '40.998')] [2024-08-05 17:17:14,670][15444] Updated weights for policy 0, policy_version 27751 (0.0012) [2024-08-05 17:17:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 227409920. Throughput: 0: 6044.9. Samples: 56859450. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:17:18,126][15372] Avg episode reward: [(0, '40.140')] [2024-08-05 17:17:18,250][15444] Updated weights for policy 0, policy_version 27761 (0.0022) [2024-08-05 17:17:20,336][15417] Signal inference workers to stop experience collection... (10400 times) [2024-08-05 17:17:20,336][15417] Signal inference workers to resume experience collection... (10400 times) [2024-08-05 17:17:20,368][15444] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-08-05 17:17:20,396][15444] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-08-05 17:17:21,512][15444] Updated weights for policy 0, policy_version 27771 (0.0018) [2024-08-05 17:17:23,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.9, 300 sec: 24270.6). Total num frames: 227540992. Throughput: 0: 6059.1. Samples: 56878230. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:17:23,119][15372] Avg episode reward: [(0, '39.394')] [2024-08-05 17:17:24,617][15444] Updated weights for policy 0, policy_version 27781 (0.0011) [2024-08-05 17:17:27,960][15444] Updated weights for policy 0, policy_version 27791 (0.0011) [2024-08-05 17:17:28,120][15372] Fps is (10 sec: 25391.5, 60 sec: 24439.0, 300 sec: 24242.8). Total num frames: 227663872. Throughput: 0: 6096.5. Samples: 56916210. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:17:28,120][15372] Avg episode reward: [(0, '40.810')] [2024-08-05 17:17:31,436][15444] Updated weights for policy 0, policy_version 27801 (0.0011) [2024-08-05 17:17:33,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 227786752. Throughput: 0: 6093.2. Samples: 56952120. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:17:33,126][15372] Avg episode reward: [(0, '41.340')] [2024-08-05 17:17:34,625][15444] Updated weights for policy 0, policy_version 27811 (0.0021) [2024-08-05 17:17:38,119][15372] Fps is (10 sec: 23760.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 227901440. Throughput: 0: 6098.2. Samples: 56970980. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:17:38,126][15372] Avg episode reward: [(0, '41.781')] [2024-08-05 17:17:38,130][15417] Saving new best policy, reward=41.781! [2024-08-05 17:17:38,404][15444] Updated weights for policy 0, policy_version 27821 (0.0015) [2024-08-05 17:17:41,509][15444] Updated weights for policy 0, policy_version 27831 (0.0012) [2024-08-05 17:17:43,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 228024320. Throughput: 0: 6093.8. Samples: 57006940. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:17:43,127][15372] Avg episode reward: [(0, '40.530')] [2024-08-05 17:17:44,872][15444] Updated weights for policy 0, policy_version 27841 (0.0026) [2024-08-05 17:17:48,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24303.0, 300 sec: 24242.7). Total num frames: 228147200. Throughput: 0: 6113.3. Samples: 57043470. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:17:48,127][15372] Avg episode reward: [(0, '39.206')] [2024-08-05 17:17:48,390][15444] Updated weights for policy 0, policy_version 27851 (0.0015) [2024-08-05 17:17:51,505][15444] Updated weights for policy 0, policy_version 27861 (0.0034) [2024-08-05 17:17:53,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 228270080. Throughput: 0: 6096.4. Samples: 57061730. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:17:53,126][15372] Avg episode reward: [(0, '39.610')] [2024-08-05 17:17:54,971][15444] Updated weights for policy 0, policy_version 27871 (0.0018) [2024-08-05 17:17:58,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 228392960. Throughput: 0: 6135.1. Samples: 57098830. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:17:58,126][15372] Avg episode reward: [(0, '39.158')] [2024-08-05 17:17:58,322][15444] Updated weights for policy 0, policy_version 27881 (0.0018) [2024-08-05 17:18:01,741][15444] Updated weights for policy 0, policy_version 27891 (0.0012) [2024-08-05 17:18:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24439.4, 300 sec: 24270.5). Total num frames: 228507648. Throughput: 0: 6105.3. Samples: 57134190. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:18:03,119][15372] Avg episode reward: [(0, '39.421')] [2024-08-05 17:18:05,154][15444] Updated weights for policy 0, policy_version 27901 (0.0025) [2024-08-05 17:18:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24270.7). Total num frames: 228630528. Throughput: 0: 6100.5. Samples: 57152750. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:18:08,126][15372] Avg episode reward: [(0, '40.498')] [2024-08-05 17:18:08,423][15444] Updated weights for policy 0, policy_version 27911 (0.0022) [2024-08-05 17:18:11,887][15444] Updated weights for policy 0, policy_version 27921 (0.0013) [2024-08-05 17:18:13,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 228761600. Throughput: 0: 6062.2. Samples: 57189000. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:18:13,119][15372] Avg episode reward: [(0, '39.353')] [2024-08-05 17:18:15,184][15444] Updated weights for policy 0, policy_version 27931 (0.0013) [2024-08-05 17:18:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 228876288. Throughput: 0: 6073.1. Samples: 57225410. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:18:18,126][15372] Avg episode reward: [(0, '40.254')] [2024-08-05 17:18:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000027939_228876288.pth... [2024-08-05 17:18:18,276][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000027229_223059968.pth [2024-08-05 17:18:18,624][15444] Updated weights for policy 0, policy_version 27941 (0.0031) [2024-08-05 17:18:22,077][15444] Updated weights for policy 0, policy_version 27951 (0.0030) [2024-08-05 17:18:23,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 228999168. Throughput: 0: 6052.0. Samples: 57243320. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:18:23,119][15372] Avg episode reward: [(0, '40.343')] [2024-08-05 17:18:25,424][15444] Updated weights for policy 0, policy_version 27961 (0.0014) [2024-08-05 17:18:27,061][15417] Signal inference workers to stop experience collection... 
(10450 times) [2024-08-05 17:18:27,070][15417] Signal inference workers to resume experience collection... (10450 times) [2024-08-05 17:18:27,139][15444] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-08-05 17:18:27,139][15444] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-08-05 17:18:28,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24167.0, 300 sec: 24215.0). Total num frames: 229113856. Throughput: 0: 6055.1. Samples: 57279420. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 17:18:28,119][15372] Avg episode reward: [(0, '39.555')] [2024-08-05 17:18:29,022][15444] Updated weights for policy 0, policy_version 27971 (0.0024) [2024-08-05 17:18:32,081][15444] Updated weights for policy 0, policy_version 27981 (0.0018) [2024-08-05 17:18:33,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24270.5). Total num frames: 229236736. Throughput: 0: 6052.7. Samples: 57315840. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 17:18:33,126][15372] Avg episode reward: [(0, '39.637')] [2024-08-05 17:18:35,574][15444] Updated weights for policy 0, policy_version 27991 (0.0012) [2024-08-05 17:18:38,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 229359616. Throughput: 0: 6053.8. Samples: 57334150. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 17:18:38,126][15372] Avg episode reward: [(0, '40.848')] [2024-08-05 17:18:39,179][15444] Updated weights for policy 0, policy_version 28001 (0.0049) [2024-08-05 17:18:42,494][15444] Updated weights for policy 0, policy_version 28011 (0.0028) [2024-08-05 17:18:43,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 229474304. Throughput: 0: 6027.5. Samples: 57370070. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 17:18:43,119][15372] Avg episode reward: [(0, '40.219')] [2024-08-05 17:18:46,014][15444] Updated weights for policy 0, policy_version 28021 (0.0012) [2024-08-05 17:18:48,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 229597184. Throughput: 0: 6049.3. Samples: 57406410. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:18:48,120][15372] Avg episode reward: [(0, '39.313')] [2024-08-05 17:18:49,310][15444] Updated weights for policy 0, policy_version 28031 (0.0014) [2024-08-05 17:18:52,691][15444] Updated weights for policy 0, policy_version 28041 (0.0011) [2024-08-05 17:18:53,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 229720064. Throughput: 0: 6015.3. Samples: 57423440. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:18:53,119][15372] Avg episode reward: [(0, '40.015')] [2024-08-05 17:18:56,401][15444] Updated weights for policy 0, policy_version 28051 (0.0011) [2024-08-05 17:18:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 229834752. Throughput: 0: 6004.4. Samples: 57459200. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:18:58,119][15372] Avg episode reward: [(0, '40.557')] [2024-08-05 17:18:59,446][15444] Updated weights for policy 0, policy_version 28061 (0.0012) [2024-08-05 17:19:03,040][15444] Updated weights for policy 0, policy_version 28071 (0.0011) [2024-08-05 17:19:03,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 229957632. Throughput: 0: 6013.3. Samples: 57496010. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:19:03,119][15372] Avg episode reward: [(0, '39.582')] [2024-08-05 17:19:06,137][15444] Updated weights for policy 0, policy_version 28081 (0.0020) [2024-08-05 17:19:08,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.3, 300 sec: 24242.7). 
Total num frames: 230080512. Throughput: 0: 6026.2. Samples: 57514500. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:19:08,127][15372] Avg episode reward: [(0, '40.274')] [2024-08-05 17:19:09,764][15444] Updated weights for policy 0, policy_version 28091 (0.0011) [2024-08-05 17:19:09,842][15417] Signal inference workers to stop experience collection... (10500 times) [2024-08-05 17:19:09,843][15417] Signal inference workers to resume experience collection... (10500 times) [2024-08-05 17:19:09,940][15444] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-08-05 17:19:09,940][15444] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-08-05 17:19:13,119][15372] Fps is (10 sec: 23756.0, 60 sec: 23893.1, 300 sec: 24187.7). Total num frames: 230195200. Throughput: 0: 6031.5. Samples: 57550840. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:19:13,120][15372] Avg episode reward: [(0, '41.292')] [2024-08-05 17:19:13,150][15444] Updated weights for policy 0, policy_version 28101 (0.0020) [2024-08-05 17:19:16,416][15444] Updated weights for policy 0, policy_version 28111 (0.0031) [2024-08-05 17:19:18,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 230326272. Throughput: 0: 6017.5. Samples: 57586630. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:19:18,126][15372] Avg episode reward: [(0, '41.547')] [2024-08-05 17:19:19,884][15444] Updated weights for policy 0, policy_version 28121 (0.0011) [2024-08-05 17:19:23,118][15372] Fps is (10 sec: 24577.4, 60 sec: 24029.9, 300 sec: 24216.1). Total num frames: 230440960. Throughput: 0: 6024.7. Samples: 57605260. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:19:23,126][15372] Avg episode reward: [(0, '40.672')] [2024-08-05 17:19:23,263][15444] Updated weights for policy 0, policy_version 28131 (0.0012) [2024-08-05 17:19:26,478][15444] Updated weights for policy 0, policy_version 28141 (0.0017) [2024-08-05 17:19:28,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 230563840. Throughput: 0: 6027.8. Samples: 57641320. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:19:28,126][15372] Avg episode reward: [(0, '40.401')] [2024-08-05 17:19:30,175][15444] Updated weights for policy 0, policy_version 28151 (0.0021) [2024-08-05 17:19:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 230686720. Throughput: 0: 6035.3. Samples: 57678000. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:19:33,126][15372] Avg episode reward: [(0, '39.869')] [2024-08-05 17:19:33,369][15444] Updated weights for policy 0, policy_version 28161 (0.0028) [2024-08-05 17:19:36,911][15444] Updated weights for policy 0, policy_version 28171 (0.0017) [2024-08-05 17:19:38,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.5, 300 sec: 24242.8). Total num frames: 230809600. Throughput: 0: 6061.3. Samples: 57696200. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:19:38,119][15372] Avg episode reward: [(0, '40.380')] [2024-08-05 17:19:39,955][15444] Updated weights for policy 0, policy_version 28181 (0.0011) [2024-08-05 17:19:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24215.3). Total num frames: 230924288. Throughput: 0: 6076.9. Samples: 57732660. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:19:43,126][15372] Avg episode reward: [(0, '39.893')] [2024-08-05 17:19:43,437][15444] Updated weights for policy 0, policy_version 28191 (0.0018) [2024-08-05 17:19:47,167][15444] Updated weights for policy 0, policy_version 28201 (0.0013) [2024-08-05 17:19:48,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 231047168. Throughput: 0: 6040.5. Samples: 57767830. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:19:48,119][15372] Avg episode reward: [(0, '38.753')] [2024-08-05 17:19:49,472][15417] Signal inference workers to stop experience collection... (10550 times) [2024-08-05 17:19:49,474][15417] Signal inference workers to resume experience collection... (10550 times) [2024-08-05 17:19:49,529][15444] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-08-05 17:19:49,530][15444] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-08-05 17:19:50,129][15444] Updated weights for policy 0, policy_version 28211 (0.0010) [2024-08-05 17:19:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 231170048. Throughput: 0: 6048.5. Samples: 57786680. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:19:53,119][15372] Avg episode reward: [(0, '39.803')] [2024-08-05 17:19:53,728][15444] Updated weights for policy 0, policy_version 28221 (0.0016) [2024-08-05 17:19:57,188][15444] Updated weights for policy 0, policy_version 28231 (0.0015) [2024-08-05 17:19:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 231292928. Throughput: 0: 6034.7. Samples: 57822400. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:19:58,119][15372] Avg episode reward: [(0, '39.600')] [2024-08-05 17:20:00,493][15444] Updated weights for policy 0, policy_version 28241 (0.0020) [2024-08-05 17:20:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 231407616. Throughput: 0: 6053.4. Samples: 57859030. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:20:03,119][15372] Avg episode reward: [(0, '40.397')] [2024-08-05 17:20:04,063][15444] Updated weights for policy 0, policy_version 28251 (0.0030) [2024-08-05 17:20:07,234][15444] Updated weights for policy 0, policy_version 28261 (0.0012) [2024-08-05 17:20:08,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24166.3, 300 sec: 24242.7). Total num frames: 231530496. Throughput: 0: 6036.6. Samples: 57876910. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:20:08,127][15372] Avg episode reward: [(0, '39.768')] [2024-08-05 17:20:10,516][15444] Updated weights for policy 0, policy_version 28271 (0.0017) [2024-08-05 17:20:13,120][15372] Fps is (10 sec: 24572.3, 60 sec: 24302.5, 300 sec: 24242.6). Total num frames: 231653376. Throughput: 0: 6060.3. Samples: 57914040. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:20:13,120][15372] Avg episode reward: [(0, '40.242')] [2024-08-05 17:20:13,962][15444] Updated weights for policy 0, policy_version 28281 (0.0021) [2024-08-05 17:20:17,338][15444] Updated weights for policy 0, policy_version 28291 (0.0018) [2024-08-05 17:20:18,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 231776256. Throughput: 0: 6048.2. Samples: 57950170. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:20:18,119][15372] Avg episode reward: [(0, '39.907')] [2024-08-05 17:20:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000028293_231776256.pth... 
[2024-08-05 17:20:18,262][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000027584_225968128.pth [2024-08-05 17:20:20,620][15444] Updated weights for policy 0, policy_version 28301 (0.0033) [2024-08-05 17:20:23,118][15372] Fps is (10 sec: 24579.6, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 231899136. Throughput: 0: 6049.6. Samples: 57968430. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:20:23,119][15372] Avg episode reward: [(0, '40.070')] [2024-08-05 17:20:23,933][15444] Updated weights for policy 0, policy_version 28311 (0.0020) [2024-08-05 17:20:27,506][15444] Updated weights for policy 0, policy_version 28321 (0.0023) [2024-08-05 17:20:28,119][15372] Fps is (10 sec: 24576.3, 60 sec: 24302.8, 300 sec: 24242.7). Total num frames: 232022016. Throughput: 0: 6046.8. Samples: 58004770. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:20:28,119][15372] Avg episode reward: [(0, '39.659')] [2024-08-05 17:20:31,006][15444] Updated weights for policy 0, policy_version 28331 (0.0010) [2024-08-05 17:20:33,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 232136704. Throughput: 0: 6073.5. Samples: 58041140. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:20:33,119][15372] Avg episode reward: [(0, '39.283')] [2024-08-05 17:20:34,121][15444] Updated weights for policy 0, policy_version 28341 (0.0011) [2024-08-05 17:20:36,295][15417] Signal inference workers to stop experience collection... (10600 times) [2024-08-05 17:20:36,295][15417] Signal inference workers to resume experience collection... 
(10600 times) [2024-08-05 17:20:36,368][15444] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-08-05 17:20:36,368][15444] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-08-05 17:20:37,694][15444] Updated weights for policy 0, policy_version 28351 (0.0020) [2024-08-05 17:20:38,118][15372] Fps is (10 sec: 23757.6, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 232259584. Throughput: 0: 6057.3. Samples: 58059260. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:20:38,119][15372] Avg episode reward: [(0, '40.039')] [2024-08-05 17:20:40,630][15444] Updated weights for policy 0, policy_version 28361 (0.0024) [2024-08-05 17:20:43,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.8, 300 sec: 24215.0). Total num frames: 232382464. Throughput: 0: 6093.3. Samples: 58096600. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:20:43,119][15372] Avg episode reward: [(0, '39.059')] [2024-08-05 17:20:44,248][15444] Updated weights for policy 0, policy_version 28371 (0.0010) [2024-08-05 17:20:47,879][15444] Updated weights for policy 0, policy_version 28381 (0.0011) [2024-08-05 17:20:48,119][15372] Fps is (10 sec: 24573.9, 60 sec: 24302.6, 300 sec: 24214.9). Total num frames: 232505344. Throughput: 0: 6081.7. Samples: 58132710. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:20:48,120][15372] Avg episode reward: [(0, '39.091')] [2024-08-05 17:20:50,910][15444] Updated weights for policy 0, policy_version 28391 (0.0015) [2024-08-05 17:20:53,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 232628224. Throughput: 0: 6079.8. Samples: 58150500. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:20:53,126][15372] Avg episode reward: [(0, '40.188')] [2024-08-05 17:20:54,713][15444] Updated weights for policy 0, policy_version 28401 (0.0023) [2024-08-05 17:20:57,881][15444] Updated weights for policy 0, policy_version 28411 (0.0011) [2024-08-05 17:20:58,120][15372] Fps is (10 sec: 23755.5, 60 sec: 24165.8, 300 sec: 24214.9). Total num frames: 232742912. Throughput: 0: 6049.3. Samples: 58186260. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 17:20:58,120][15372] Avg episode reward: [(0, '39.870')] [2024-08-05 17:21:01,302][15444] Updated weights for policy 0, policy_version 28421 (0.0011) [2024-08-05 17:21:03,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 232857600. Throughput: 0: 6049.6. Samples: 58222400. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 17:21:03,130][15372] Avg episode reward: [(0, '39.036')] [2024-08-05 17:21:04,880][15444] Updated weights for policy 0, policy_version 28431 (0.0011) [2024-08-05 17:21:08,079][15444] Updated weights for policy 0, policy_version 28441 (0.0022) [2024-08-05 17:21:08,119][15372] Fps is (10 sec: 24579.1, 60 sec: 24303.1, 300 sec: 24243.0). Total num frames: 232988672. Throughput: 0: 6051.5. Samples: 58240750. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 17:21:08,119][15372] Avg episode reward: [(0, '39.024')] [2024-08-05 17:21:11,568][15444] Updated weights for policy 0, policy_version 28451 (0.0028) [2024-08-05 17:21:13,124][15372] Fps is (10 sec: 25380.8, 60 sec: 24301.2, 300 sec: 24214.5). Total num frames: 233111552. Throughput: 0: 6051.9. Samples: 58277140. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 17:21:13,132][15372] Avg episode reward: [(0, '39.831')] [2024-08-05 17:21:14,779][15444] Updated weights for policy 0, policy_version 28461 (0.0025) [2024-08-05 17:21:18,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.6, 300 sec: 24215.0). 
Total num frames: 233226240. Throughput: 0: 6056.5. Samples: 58313680. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:21:18,126][15372] Avg episode reward: [(0, '40.444')] [2024-08-05 17:21:18,148][15444] Updated weights for policy 0, policy_version 28471 (0.0013) [2024-08-05 17:21:18,264][15417] Signal inference workers to stop experience collection... (10650 times) [2024-08-05 17:21:18,264][15417] Signal inference workers to resume experience collection... (10650 times) [2024-08-05 17:21:18,327][15444] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-08-05 17:21:18,336][15444] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-08-05 17:21:21,604][15444] Updated weights for policy 0, policy_version 28481 (0.0027) [2024-08-05 17:21:23,119][15372] Fps is (10 sec: 23769.9, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 233349120. Throughput: 0: 6071.5. Samples: 58332480. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:21:23,126][15372] Avg episode reward: [(0, '39.441')] [2024-08-05 17:21:24,868][15444] Updated weights for policy 0, policy_version 28491 (0.0011) [2024-08-05 17:21:28,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 233472000. Throughput: 0: 6048.3. Samples: 58368770. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:21:28,126][15372] Avg episode reward: [(0, '38.870')] [2024-08-05 17:21:28,483][15444] Updated weights for policy 0, policy_version 28501 (0.0024) [2024-08-05 17:21:31,526][15444] Updated weights for policy 0, policy_version 28511 (0.0017) [2024-08-05 17:21:33,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 233594880. Throughput: 0: 6045.0. Samples: 58404730. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:21:33,126][15372] Avg episode reward: [(0, '39.421')] [2024-08-05 17:21:35,006][15444] Updated weights for policy 0, policy_version 28521 (0.0013) [2024-08-05 17:21:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 233717760. Throughput: 0: 6060.4. Samples: 58423220. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:21:38,126][15372] Avg episode reward: [(0, '40.346')] [2024-08-05 17:21:38,309][15444] Updated weights for policy 0, policy_version 28531 (0.0025) [2024-08-05 17:21:41,869][15444] Updated weights for policy 0, policy_version 28541 (0.0021) [2024-08-05 17:21:43,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 233832448. Throughput: 0: 6068.8. Samples: 58459350. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:21:43,119][15372] Avg episode reward: [(0, '40.462')] [2024-08-05 17:21:45,268][15444] Updated weights for policy 0, policy_version 28551 (0.0011) [2024-08-05 17:21:48,119][15372] Fps is (10 sec: 24574.4, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 233963520. Throughput: 0: 6084.6. Samples: 58496210. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:21:48,120][15372] Avg episode reward: [(0, '40.136')] [2024-08-05 17:21:48,551][15444] Updated weights for policy 0, policy_version 28561 (0.0012) [2024-08-05 17:21:51,943][15444] Updated weights for policy 0, policy_version 28571 (0.0015) [2024-08-05 17:21:53,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.3, 300 sec: 24242.7). Total num frames: 234078208. Throughput: 0: 6076.2. Samples: 58514180. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:21:53,119][15372] Avg episode reward: [(0, '40.136')] [2024-08-05 17:21:55,216][15444] Updated weights for policy 0, policy_version 28581 (0.0021) [2024-08-05 17:21:58,119][15372] Fps is (10 sec: 23757.9, 60 sec: 24303.4, 300 sec: 24270.5). 
Total num frames: 234201088. Throughput: 0: 6087.4. Samples: 58551040. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:21:58,126][15372] Avg episode reward: [(0, '40.356')] [2024-08-05 17:21:58,740][15444] Updated weights for policy 0, policy_version 28591 (0.0011) [2024-08-05 17:22:01,270][15417] Signal inference workers to stop experience collection... (10700 times) [2024-08-05 17:22:01,270][15417] Signal inference workers to resume experience collection... (10700 times) [2024-08-05 17:22:01,327][15444] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-08-05 17:22:01,327][15444] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-08-05 17:22:02,049][15444] Updated weights for policy 0, policy_version 28601 (0.0011) [2024-08-05 17:22:03,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 234323968. Throughput: 0: 6065.6. Samples: 58586630. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:22:03,119][15372] Avg episode reward: [(0, '39.568')] [2024-08-05 17:22:05,391][15444] Updated weights for policy 0, policy_version 28611 (0.0012) [2024-08-05 17:22:08,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 234446848. Throughput: 0: 6068.0. Samples: 58605540. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:22:08,119][15372] Avg episode reward: [(0, '40.025')] [2024-08-05 17:22:08,648][15444] Updated weights for policy 0, policy_version 28621 (0.0018) [2024-08-05 17:22:12,044][15444] Updated weights for policy 0, policy_version 28631 (0.0021) [2024-08-05 17:22:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24305.3, 300 sec: 24270.5). Total num frames: 234569728. Throughput: 0: 6072.9. Samples: 58642050. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:22:13,119][15372] Avg episode reward: [(0, '40.053')] [2024-08-05 17:22:15,407][15444] Updated weights for policy 0, policy_version 28641 (0.0022) [2024-08-05 17:22:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 234684416. Throughput: 0: 6078.9. Samples: 58678280. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:22:18,126][15372] Avg episode reward: [(0, '39.160')] [2024-08-05 17:22:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000028648_234684416.pth... [2024-08-05 17:22:18,269][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000027939_228876288.pth [2024-08-05 17:22:18,896][15444] Updated weights for policy 0, policy_version 28651 (0.0037) [2024-08-05 17:22:22,498][15444] Updated weights for policy 0, policy_version 28661 (0.0012) [2024-08-05 17:22:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24215.1). Total num frames: 234807296. Throughput: 0: 6061.3. Samples: 58695980. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:22:23,119][15372] Avg episode reward: [(0, '39.780')] [2024-08-05 17:22:25,612][15444] Updated weights for policy 0, policy_version 28671 (0.0026) [2024-08-05 17:22:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 234921984. Throughput: 0: 6048.7. Samples: 58731540. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:22:28,126][15372] Avg episode reward: [(0, '39.960')] [2024-08-05 17:22:29,033][15444] Updated weights for policy 0, policy_version 28681 (0.0024) [2024-08-05 17:22:32,824][15444] Updated weights for policy 0, policy_version 28691 (0.0014) [2024-08-05 17:22:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 235044864. 
Throughput: 0: 6039.6. Samples: 58767990. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 17:22:33,119][15372] Avg episode reward: [(0, '40.206')] [2024-08-05 17:22:35,969][15444] Updated weights for policy 0, policy_version 28701 (0.0013) [2024-08-05 17:22:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 235167744. Throughput: 0: 6044.9. Samples: 58786200. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 17:22:38,126][15372] Avg episode reward: [(0, '42.293')] [2024-08-05 17:22:38,129][15417] Saving new best policy, reward=42.293! [2024-08-05 17:22:39,471][15444] Updated weights for policy 0, policy_version 28711 (0.0032) [2024-08-05 17:22:42,837][15444] Updated weights for policy 0, policy_version 28721 (0.0012) [2024-08-05 17:22:43,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 235290624. Throughput: 0: 6027.1. Samples: 58822260. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 17:22:43,119][15372] Avg episode reward: [(0, '40.413')] [2024-08-05 17:22:46,022][15444] Updated weights for policy 0, policy_version 28731 (0.0018) [2024-08-05 17:22:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.7, 300 sec: 24215.0). Total num frames: 235413504. Throughput: 0: 6037.5. Samples: 58858320. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 17:22:48,119][15372] Avg episode reward: [(0, '40.842')] [2024-08-05 17:22:49,719][15444] Updated weights for policy 0, policy_version 28741 (0.0023) [2024-08-05 17:22:53,022][15444] Updated weights for policy 0, policy_version 28751 (0.0016) [2024-08-05 17:22:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 235528192. Throughput: 0: 6014.9. Samples: 58876210. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:22:53,119][15372] Avg episode reward: [(0, '40.519')] [2024-08-05 17:22:53,862][15417] Signal inference workers to stop experience collection... 
(10750 times) [2024-08-05 17:22:53,863][15417] Signal inference workers to resume experience collection... (10750 times) [2024-08-05 17:22:53,901][15444] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-08-05 17:22:53,901][15444] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-08-05 17:22:56,175][15444] Updated weights for policy 0, policy_version 28761 (0.0011) [2024-08-05 17:22:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 235659264. Throughput: 0: 6025.5. Samples: 58913200. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:22:58,119][15372] Avg episode reward: [(0, '40.433')] [2024-08-05 17:22:59,577][15444] Updated weights for policy 0, policy_version 28771 (0.0012) [2024-08-05 17:23:02,886][15444] Updated weights for policy 0, policy_version 28781 (0.0027) [2024-08-05 17:23:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 235773952. Throughput: 0: 6040.2. Samples: 58950090. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:23:03,119][15372] Avg episode reward: [(0, '39.894')] [2024-08-05 17:23:06,206][15444] Updated weights for policy 0, policy_version 28791 (0.0014) [2024-08-05 17:23:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 235896832. Throughput: 0: 6074.4. Samples: 58969330. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:23:08,126][15372] Avg episode reward: [(0, '40.548')] [2024-08-05 17:23:09,484][15444] Updated weights for policy 0, policy_version 28801 (0.0011) [2024-08-05 17:23:12,839][15444] Updated weights for policy 0, policy_version 28811 (0.0012) [2024-08-05 17:23:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 236019712. Throughput: 0: 6100.9. Samples: 59006080. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:23:13,119][15372] Avg episode reward: [(0, '40.368')] [2024-08-05 17:23:16,159][15444] Updated weights for policy 0, policy_version 28821 (0.0012) [2024-08-05 17:23:18,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 236142592. Throughput: 0: 6099.5. Samples: 59042470. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:23:18,126][15372] Avg episode reward: [(0, '41.265')] [2024-08-05 17:23:19,465][15444] Updated weights for policy 0, policy_version 28831 (0.0034) [2024-08-05 17:23:22,804][15444] Updated weights for policy 0, policy_version 28841 (0.0016) [2024-08-05 17:23:23,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 236265472. Throughput: 0: 6120.9. Samples: 59061640. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:23:23,119][15372] Avg episode reward: [(0, '40.581')] [2024-08-05 17:23:26,002][15444] Updated weights for policy 0, policy_version 28851 (0.0011) [2024-08-05 17:23:28,119][15372] Fps is (10 sec: 25395.1, 60 sec: 24575.9, 300 sec: 24270.5). Total num frames: 236396544. Throughput: 0: 6121.3. Samples: 59097720. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:23:28,119][15372] Avg episode reward: [(0, '38.862')] [2024-08-05 17:23:29,689][15444] Updated weights for policy 0, policy_version 28861 (0.0010) [2024-08-05 17:23:32,884][15444] Updated weights for policy 0, policy_version 28871 (0.0020) [2024-08-05 17:23:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.4, 300 sec: 24242.8). Total num frames: 236511232. Throughput: 0: 6132.7. Samples: 59134290. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:23:33,119][15372] Avg episode reward: [(0, '39.535')] [2024-08-05 17:23:36,195][15444] Updated weights for policy 0, policy_version 28881 (0.0019) [2024-08-05 17:23:38,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24439.5, 300 sec: 24270.5). 
Total num frames: 236634112. Throughput: 0: 6154.9. Samples: 59153180. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:23:38,126][15372] Avg episode reward: [(0, '41.183')] [2024-08-05 17:23:39,599][15444] Updated weights for policy 0, policy_version 28891 (0.0011) [2024-08-05 17:23:42,821][15444] Updated weights for policy 0, policy_version 28901 (0.0011) [2024-08-05 17:23:43,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24576.0, 300 sec: 24298.3). Total num frames: 236765184. Throughput: 0: 6157.3. Samples: 59190280. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:23:43,119][15372] Avg episode reward: [(0, '40.206')] [2024-08-05 17:23:46,494][15444] Updated weights for policy 0, policy_version 28911 (0.0011) [2024-08-05 17:23:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 236879872. Throughput: 0: 6130.9. Samples: 59225980. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:23:48,119][15372] Avg episode reward: [(0, '40.220')] [2024-08-05 17:23:49,602][15444] Updated weights for policy 0, policy_version 28921 (0.0018) [2024-08-05 17:23:50,147][15417] Signal inference workers to stop experience collection... (10800 times) [2024-08-05 17:23:50,148][15417] Signal inference workers to resume experience collection... (10800 times) [2024-08-05 17:23:50,179][15444] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-08-05 17:23:50,184][15444] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-08-05 17:23:52,834][15444] Updated weights for policy 0, policy_version 28931 (0.0018) [2024-08-05 17:23:53,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24575.8, 300 sec: 24298.3). Total num frames: 237002752. Throughput: 0: 6126.8. Samples: 59245040. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:23:53,119][15372] Avg episode reward: [(0, '40.316')] [2024-08-05 17:23:56,317][15444] Updated weights for policy 0, policy_version 28941 (0.0011) [2024-08-05 17:23:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24270.6). Total num frames: 237117440. Throughput: 0: 6111.6. Samples: 59281100. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:23:58,126][15372] Avg episode reward: [(0, '40.305')] [2024-08-05 17:23:59,545][15444] Updated weights for policy 0, policy_version 28951 (0.0012) [2024-08-05 17:24:03,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24439.5, 300 sec: 24270.6). Total num frames: 237240320. Throughput: 0: 6114.7. Samples: 59317630. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:24:03,126][15372] Avg episode reward: [(0, '39.223')] [2024-08-05 17:24:03,131][15444] Updated weights for policy 0, policy_version 28961 (0.0025) [2024-08-05 17:24:06,467][15444] Updated weights for policy 0, policy_version 28971 (0.0023) [2024-08-05 17:24:08,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24576.0, 300 sec: 24326.1). Total num frames: 237371392. Throughput: 0: 6102.0. Samples: 59336230. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:24:08,128][15372] Avg episode reward: [(0, '39.873')] [2024-08-05 17:24:09,772][15444] Updated weights for policy 0, policy_version 28981 (0.0019) [2024-08-05 17:24:13,036][15444] Updated weights for policy 0, policy_version 28991 (0.0013) [2024-08-05 17:24:13,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24576.0, 300 sec: 24298.3). Total num frames: 237494272. Throughput: 0: 6123.6. Samples: 59373280. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:24:13,119][15372] Avg episode reward: [(0, '40.116')] [2024-08-05 17:24:16,378][15444] Updated weights for policy 0, policy_version 29001 (0.0028) [2024-08-05 17:24:18,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24439.4, 300 sec: 24298.3). 
Total num frames: 237608960. Throughput: 0: 6120.4. Samples: 59409710. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:24:18,127][15372] Avg episode reward: [(0, '40.238')] [2024-08-05 17:24:18,158][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000029006_237617152.pth... [2024-08-05 17:24:18,294][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000028293_231776256.pth [2024-08-05 17:24:19,942][15444] Updated weights for policy 0, policy_version 29011 (0.0023) [2024-08-05 17:24:23,016][15444] Updated weights for policy 0, policy_version 29021 (0.0017) [2024-08-05 17:24:23,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24576.0, 300 sec: 24326.1). Total num frames: 237740032. Throughput: 0: 6114.9. Samples: 59428350. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:24:23,119][15372] Avg episode reward: [(0, '40.153')] [2024-08-05 17:24:26,626][15444] Updated weights for policy 0, policy_version 29031 (0.0009) [2024-08-05 17:24:28,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24303.0, 300 sec: 24298.3). Total num frames: 237854720. Throughput: 0: 6090.9. Samples: 59464370. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:24:28,119][15372] Avg episode reward: [(0, '40.012')] [2024-08-05 17:24:29,941][15444] Updated weights for policy 0, policy_version 29041 (0.0024) [2024-08-05 17:24:33,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24439.5, 300 sec: 24298.3). Total num frames: 237977600. Throughput: 0: 6118.7. Samples: 59501320. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:24:33,126][15372] Avg episode reward: [(0, '40.219')] [2024-08-05 17:24:33,286][15444] Updated weights for policy 0, policy_version 29051 (0.0014) [2024-08-05 17:24:36,634][15444] Updated weights for policy 0, policy_version 29061 (0.0017) [2024-08-05 17:24:37,579][15417] Signal inference workers to stop experience collection... (10850 times) [2024-08-05 17:24:37,580][15417] Signal inference workers to resume experience collection... (10850 times) [2024-08-05 17:24:37,665][15444] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-08-05 17:24:37,665][15444] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-08-05 17:24:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24326.1). Total num frames: 238100480. Throughput: 0: 6103.2. Samples: 59519680. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:24:38,119][15372] Avg episode reward: [(0, '40.186')] [2024-08-05 17:24:40,002][15444] Updated weights for policy 0, policy_version 29071 (0.0019) [2024-08-05 17:24:43,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24326.1). Total num frames: 238223360. Throughput: 0: 6122.0. Samples: 59556590. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:24:43,126][15372] Avg episode reward: [(0, '40.504')] [2024-08-05 17:24:43,543][15444] Updated weights for policy 0, policy_version 29081 (0.0017) [2024-08-05 17:24:46,632][15444] Updated weights for policy 0, policy_version 29091 (0.0019) [2024-08-05 17:24:48,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 238338048. Throughput: 0: 6110.0. Samples: 59592580. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:24:48,127][15372] Avg episode reward: [(0, '40.094')] [2024-08-05 17:24:50,119][15444] Updated weights for policy 0, policy_version 29101 (0.0020) [2024-08-05 17:24:53,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.6, 300 sec: 24326.1). Total num frames: 238469120. Throughput: 0: 6102.0. Samples: 59610820. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:24:53,126][15372] Avg episode reward: [(0, '39.910')] [2024-08-05 17:24:53,576][15444] Updated weights for policy 0, policy_version 29111 (0.0010) [2024-08-05 17:24:56,673][15444] Updated weights for policy 0, policy_version 29121 (0.0029) [2024-08-05 17:24:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24326.1). Total num frames: 238583808. Throughput: 0: 6083.1. Samples: 59647020. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:24:58,119][15372] Avg episode reward: [(0, '40.381')] [2024-08-05 17:25:00,345][15444] Updated weights for policy 0, policy_version 29131 (0.0016) [2024-08-05 17:25:03,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24439.4, 300 sec: 24326.1). Total num frames: 238706688. Throughput: 0: 6081.6. Samples: 59683380. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:25:03,119][15372] Avg episode reward: [(0, '40.916')] [2024-08-05 17:25:03,401][15444] Updated weights for policy 0, policy_version 29141 (0.0013) [2024-08-05 17:25:07,129][15444] Updated weights for policy 0, policy_version 29151 (0.0017) [2024-08-05 17:25:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24326.2). Total num frames: 238829568. Throughput: 0: 6071.1. Samples: 59701550. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:25:08,119][15372] Avg episode reward: [(0, '39.500')] [2024-08-05 17:25:10,335][15444] Updated weights for policy 0, policy_version 29161 (0.0054) [2024-08-05 17:25:13,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24302.9, 300 sec: 24326.1). 
Total num frames: 238952448. Throughput: 0: 6084.2. Samples: 59738160. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:25:13,119][15372] Avg episode reward: [(0, '39.306')] [2024-08-05 17:25:13,655][15444] Updated weights for policy 0, policy_version 29171 (0.0010) [2024-08-05 17:25:17,149][15444] Updated weights for policy 0, policy_version 29181 (0.0031) [2024-08-05 17:25:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.6, 300 sec: 24326.1). Total num frames: 239075328. Throughput: 0: 6068.9. Samples: 59774420. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:25:18,119][15372] Avg episode reward: [(0, '39.961')] [2024-08-05 17:25:20,296][15444] Updated weights for policy 0, policy_version 29191 (0.0014) [2024-08-05 17:25:21,869][15417] Signal inference workers to stop experience collection... (10900 times) [2024-08-05 17:25:21,877][15417] Signal inference workers to resume experience collection... (10900 times) [2024-08-05 17:25:21,939][15444] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-08-05 17:25:21,939][15444] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-08-05 17:25:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24326.1). Total num frames: 239198208. Throughput: 0: 6080.7. Samples: 59793310. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:25:23,119][15372] Avg episode reward: [(0, '40.097')] [2024-08-05 17:25:23,722][15444] Updated weights for policy 0, policy_version 29201 (0.0024) [2024-08-05 17:25:27,243][15444] Updated weights for policy 0, policy_version 29211 (0.0025) [2024-08-05 17:25:28,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24439.3, 300 sec: 24353.8). Total num frames: 239321088. Throughput: 0: 6073.7. Samples: 59829910. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:25:28,119][15372] Avg episode reward: [(0, '40.018')] [2024-08-05 17:25:30,243][15444] Updated weights for policy 0, policy_version 29221 (0.0017) [2024-08-05 17:25:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24326.1). Total num frames: 239435776. Throughput: 0: 6065.3. Samples: 59865520. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:25:33,126][15372] Avg episode reward: [(0, '40.762')] [2024-08-05 17:25:33,941][15444] Updated weights for policy 0, policy_version 29231 (0.0012) [2024-08-05 17:25:37,468][15444] Updated weights for policy 0, policy_version 29241 (0.0012) [2024-08-05 17:25:38,118][15372] Fps is (10 sec: 23757.6, 60 sec: 24302.9, 300 sec: 24326.1). Total num frames: 239558656. Throughput: 0: 6070.9. Samples: 59884010. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:25:38,119][15372] Avg episode reward: [(0, '40.783')] [2024-08-05 17:25:40,753][15444] Updated weights for policy 0, policy_version 29251 (0.0022) [2024-08-05 17:25:43,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.8, 300 sec: 24326.1). Total num frames: 239681536. Throughput: 0: 6069.5. Samples: 59920150. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:25:43,127][15372] Avg episode reward: [(0, '40.234')] [2024-08-05 17:25:44,160][15444] Updated weights for policy 0, policy_version 29261 (0.0022) [2024-08-05 17:25:47,494][15444] Updated weights for policy 0, policy_version 29271 (0.0014) [2024-08-05 17:25:48,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24298.3). Total num frames: 239796224. Throughput: 0: 6051.8. Samples: 59955710. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:25:48,119][15372] Avg episode reward: [(0, '40.961')] [2024-08-05 17:25:51,100][15444] Updated weights for policy 0, policy_version 29281 (0.0016) [2024-08-05 17:25:53,121][15372] Fps is (10 sec: 23751.9, 60 sec: 24165.5, 300 sec: 24326.0). 
Total num frames: 239919104. Throughput: 0: 6059.0. Samples: 59974220. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:25:53,121][15372] Avg episode reward: [(0, '41.531')] [2024-08-05 17:25:54,482][15444] Updated weights for policy 0, policy_version 29291 (0.0023) [2024-08-05 17:25:57,752][15444] Updated weights for policy 0, policy_version 29301 (0.0012) [2024-08-05 17:25:58,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24353.8). Total num frames: 240041984. Throughput: 0: 6046.2. Samples: 60010240. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:25:58,119][15372] Avg episode reward: [(0, '39.727')] [2024-08-05 17:26:01,130][15444] Updated weights for policy 0, policy_version 29311 (0.0027) [2024-08-05 17:26:03,118][15372] Fps is (10 sec: 23762.3, 60 sec: 24166.5, 300 sec: 24298.3). Total num frames: 240156672. Throughput: 0: 6040.0. Samples: 60046220. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:26:03,126][15372] Avg episode reward: [(0, '40.764')] [2024-08-05 17:26:04,411][15444] Updated weights for policy 0, policy_version 29321 (0.0022) [2024-08-05 17:26:08,051][15444] Updated weights for policy 0, policy_version 29331 (0.0031) [2024-08-05 17:26:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24298.8). Total num frames: 240279552. Throughput: 0: 6043.3. Samples: 60065260. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:26:08,119][15372] Avg episode reward: [(0, '39.744')] [2024-08-05 17:26:11,136][15444] Updated weights for policy 0, policy_version 29341 (0.0023) [2024-08-05 17:26:13,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24166.3, 300 sec: 24326.1). Total num frames: 240402432. Throughput: 0: 6024.0. Samples: 60100990. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:26:13,127][15372] Avg episode reward: [(0, '40.782')] [2024-08-05 17:26:14,533][15444] Updated weights for policy 0, policy_version 29351 (0.0012) [2024-08-05 17:26:17,848][15444] Updated weights for policy 0, policy_version 29361 (0.0024) [2024-08-05 17:26:18,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24302.9, 300 sec: 24353.9). Total num frames: 240533504. Throughput: 0: 6055.3. Samples: 60138010. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 17:26:18,119][15372] Avg episode reward: [(0, '40.871')] [2024-08-05 17:26:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000029362_240533504.pth... [2024-08-05 17:26:18,236][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000028648_234684416.pth [2024-08-05 17:26:19,551][15417] Signal inference workers to stop experience collection... (10950 times) [2024-08-05 17:26:19,552][15417] Signal inference workers to resume experience collection... (10950 times) [2024-08-05 17:26:19,601][15444] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-08-05 17:26:19,601][15444] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-08-05 17:26:21,406][15444] Updated weights for policy 0, policy_version 29371 (0.0023) [2024-08-05 17:26:23,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24326.1). Total num frames: 240648192. Throughput: 0: 6049.3. Samples: 60156230. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:26:23,119][15372] Avg episode reward: [(0, '40.247')] [2024-08-05 17:26:24,569][15444] Updated weights for policy 0, policy_version 29381 (0.0023) [2024-08-05 17:26:27,888][15444] Updated weights for policy 0, policy_version 29391 (0.0013) [2024-08-05 17:26:28,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24303.0, 300 sec: 24353.8). Total num frames: 240779264. Throughput: 0: 6091.8. Samples: 60194280. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:26:28,119][15372] Avg episode reward: [(0, '40.484')] [2024-08-05 17:26:31,262][15444] Updated weights for policy 0, policy_version 29401 (0.0025) [2024-08-05 17:26:33,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24166.2, 300 sec: 24298.3). Total num frames: 240885760. Throughput: 0: 6098.4. Samples: 60230140. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:26:33,127][15372] Avg episode reward: [(0, '40.294')] [2024-08-05 17:26:34,468][15444] Updated weights for policy 0, policy_version 29411 (0.0012) [2024-08-05 17:26:37,845][15444] Updated weights for policy 0, policy_version 29421 (0.0022) [2024-08-05 17:26:38,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24302.9, 300 sec: 24353.9). Total num frames: 241016832. Throughput: 0: 6098.3. Samples: 60248630. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:26:38,119][15372] Avg episode reward: [(0, '40.451')] [2024-08-05 17:26:41,355][15444] Updated weights for policy 0, policy_version 29431 (0.0022) [2024-08-05 17:26:43,118][15372] Fps is (10 sec: 25396.1, 60 sec: 24303.0, 300 sec: 24326.1). Total num frames: 241139712. Throughput: 0: 6102.7. Samples: 60284860. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:26:43,126][15372] Avg episode reward: [(0, '40.546')] [2024-08-05 17:26:44,710][15444] Updated weights for policy 0, policy_version 29441 (0.0013) [2024-08-05 17:26:48,030][15444] Updated weights for policy 0, policy_version 29451 (0.0015) [2024-08-05 17:26:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24353.9). Total num frames: 241262592. Throughput: 0: 6125.3. Samples: 60321860. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:26:48,119][15372] Avg episode reward: [(0, '40.032')] [2024-08-05 17:26:51,346][15444] Updated weights for policy 0, policy_version 29461 (0.0029) [2024-08-05 17:26:53,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.8, 300 sec: 24326.1). Total num frames: 241377280. Throughput: 0: 6100.4. Samples: 60339780. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:26:53,126][15372] Avg episode reward: [(0, '40.396')] [2024-08-05 17:26:54,654][15444] Updated weights for policy 0, policy_version 29471 (0.0013) [2024-08-05 17:26:58,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24302.9, 300 sec: 24326.1). Total num frames: 241500160. Throughput: 0: 6124.0. Samples: 60376570. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:26:58,126][15372] Avg episode reward: [(0, '40.032')] [2024-08-05 17:26:58,374][15444] Updated weights for policy 0, policy_version 29481 (0.0017) [2024-08-05 17:27:01,447][15444] Updated weights for policy 0, policy_version 29491 (0.0013) [2024-08-05 17:27:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.4, 300 sec: 24326.1). Total num frames: 241623040. Throughput: 0: 6103.8. Samples: 60412680. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:27:03,126][15372] Avg episode reward: [(0, '40.578')] [2024-08-05 17:27:04,987][15444] Updated weights for policy 0, policy_version 29501 (0.0035) [2024-08-05 17:27:05,079][15417] Signal inference workers to stop experience collection... 
(11000 times) [2024-08-05 17:27:05,080][15417] Signal inference workers to resume experience collection... (11000 times) [2024-08-05 17:27:05,132][15444] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-08-05 17:27:05,132][15444] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-08-05 17:27:08,090][15444] Updated weights for policy 0, policy_version 29511 (0.0022) [2024-08-05 17:27:08,118][15372] Fps is (10 sec: 25395.5, 60 sec: 24576.0, 300 sec: 24353.8). Total num frames: 241754112. Throughput: 0: 6116.0. Samples: 60431450. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:27:08,119][15372] Avg episode reward: [(0, '41.249')] [2024-08-05 17:27:11,435][15444] Updated weights for policy 0, policy_version 29521 (0.0019) [2024-08-05 17:27:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.6, 300 sec: 24353.8). Total num frames: 241868800. Throughput: 0: 6077.8. Samples: 60467780. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:27:13,126][15372] Avg episode reward: [(0, '40.639')] [2024-08-05 17:27:14,984][15444] Updated weights for policy 0, policy_version 29531 (0.0016) [2024-08-05 17:27:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24303.0, 300 sec: 24353.8). Total num frames: 241991680. Throughput: 0: 6100.9. Samples: 60504680. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:27:18,126][15372] Avg episode reward: [(0, '39.541')] [2024-08-05 17:27:18,153][15444] Updated weights for policy 0, policy_version 29541 (0.0018) [2024-08-05 17:27:21,521][15444] Updated weights for policy 0, policy_version 29551 (0.0024) [2024-08-05 17:27:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24381.6). Total num frames: 242114560. Throughput: 0: 6106.0. Samples: 60523400. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:27:23,126][15372] Avg episode reward: [(0, '40.536')] [2024-08-05 17:27:24,965][15444] Updated weights for policy 0, policy_version 29561 (0.0021) [2024-08-05 17:27:28,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.0, 300 sec: 24381.6). Total num frames: 242237440. Throughput: 0: 6116.4. Samples: 60560100. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:27:28,126][15372] Avg episode reward: [(0, '40.685')] [2024-08-05 17:27:28,249][15444] Updated weights for policy 0, policy_version 29571 (0.0028) [2024-08-05 17:27:31,675][15444] Updated weights for policy 0, policy_version 29581 (0.0027) [2024-08-05 17:27:33,119][15372] Fps is (10 sec: 24573.8, 60 sec: 24575.8, 300 sec: 24381.5). Total num frames: 242360320. Throughput: 0: 6099.2. Samples: 60596330. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:27:33,120][15372] Avg episode reward: [(0, '39.634')] [2024-08-05 17:27:34,933][15444] Updated weights for policy 0, policy_version 29591 (0.0022) [2024-08-05 17:27:38,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24439.4, 300 sec: 24381.6). Total num frames: 242483200. Throughput: 0: 6127.5. Samples: 60615520. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 17:27:38,126][15372] Avg episode reward: [(0, '39.278')] [2024-08-05 17:27:38,329][15444] Updated weights for policy 0, policy_version 29601 (0.0018) [2024-08-05 17:27:41,696][15444] Updated weights for policy 0, policy_version 29611 (0.0014) [2024-08-05 17:27:43,118][15372] Fps is (10 sec: 24578.2, 60 sec: 24439.5, 300 sec: 24381.6). Total num frames: 242606080. Throughput: 0: 6107.4. Samples: 60651400. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 17:27:43,119][15372] Avg episode reward: [(0, '40.577')] [2024-08-05 17:27:44,922][15444] Updated weights for policy 0, policy_version 29621 (0.0026) [2024-08-05 17:27:48,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24302.7, 300 sec: 24381.6). 
Total num frames: 242720768. Throughput: 0: 6113.0. Samples: 60687770. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 17:27:48,127][15372] Avg episode reward: [(0, '40.761')] [2024-08-05 17:27:48,550][15444] Updated weights for policy 0, policy_version 29631 (0.0038) [2024-08-05 17:27:51,671][15444] Updated weights for policy 0, policy_version 29641 (0.0014) [2024-08-05 17:27:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24576.0, 300 sec: 24381.6). Total num frames: 242851840. Throughput: 0: 6104.9. Samples: 60706170. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 17:27:53,126][15372] Avg episode reward: [(0, '40.404')] [2024-08-05 17:27:55,039][15444] Updated weights for policy 0, policy_version 29651 (0.0021) [2024-08-05 17:27:55,172][15417] Signal inference workers to stop experience collection... (11050 times) [2024-08-05 17:27:55,173][15417] Signal inference workers to resume experience collection... (11050 times) [2024-08-05 17:27:55,205][15444] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-08-05 17:27:55,210][15444] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-08-05 17:27:58,118][15372] Fps is (10 sec: 25396.5, 60 sec: 24576.0, 300 sec: 24409.4). Total num frames: 242974720. Throughput: 0: 6116.4. Samples: 60743020. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 17:27:58,119][15372] Avg episode reward: [(0, '40.213')] [2024-08-05 17:27:58,536][15444] Updated weights for policy 0, policy_version 29661 (0.0018) [2024-08-05 17:28:01,742][15444] Updated weights for policy 0, policy_version 29671 (0.0015) [2024-08-05 17:28:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24439.5, 300 sec: 24381.6). Total num frames: 243089408. Throughput: 0: 6100.4. Samples: 60779200. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 17:28:03,119][15372] Avg episode reward: [(0, '40.248')] [2024-08-05 17:28:05,328][15444] Updated weights for policy 0, policy_version 29681 (0.0030) [2024-08-05 17:28:08,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.5, 300 sec: 24409.4). Total num frames: 243220480. Throughput: 0: 6094.4. Samples: 60797650. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 17:28:08,119][15372] Avg episode reward: [(0, '40.803')] [2024-08-05 17:28:08,691][15444] Updated weights for policy 0, policy_version 29691 (0.0012) [2024-08-05 17:28:12,000][15444] Updated weights for policy 0, policy_version 29701 (0.0014) [2024-08-05 17:28:13,119][15372] Fps is (10 sec: 24574.7, 60 sec: 24439.3, 300 sec: 24381.6). Total num frames: 243335168. Throughput: 0: 6085.7. Samples: 60833960. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 17:28:13,119][15372] Avg episode reward: [(0, '41.026')] [2024-08-05 17:28:15,378][15444] Updated weights for policy 0, policy_version 29711 (0.0013) [2024-08-05 17:28:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24439.4, 300 sec: 24381.6). Total num frames: 243458048. Throughput: 0: 6107.0. Samples: 60871140. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:28:18,119][15372] Avg episode reward: [(0, '40.091')] [2024-08-05 17:28:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000029719_243458048.pth... [2024-08-05 17:28:18,282][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000029006_237617152.pth [2024-08-05 17:28:18,575][15444] Updated weights for policy 0, policy_version 29721 (0.0026) [2024-08-05 17:28:21,967][15444] Updated weights for policy 0, policy_version 29731 (0.0023) [2024-08-05 17:28:23,119][15372] Fps is (10 sec: 24577.0, 60 sec: 24439.4, 300 sec: 24353.9). Total num frames: 243580928. 
Throughput: 0: 6082.5. Samples: 60889230. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:28:23,119][15372] Avg episode reward: [(0, '39.974')] [2024-08-05 17:28:25,418][15444] Updated weights for policy 0, policy_version 29741 (0.0011) [2024-08-05 17:28:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24303.0, 300 sec: 24353.8). Total num frames: 243695616. Throughput: 0: 6090.7. Samples: 60925480. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:28:28,126][15372] Avg episode reward: [(0, '40.122')] [2024-08-05 17:28:28,879][15444] Updated weights for policy 0, policy_version 29751 (0.0034) [2024-08-05 17:28:32,309][15444] Updated weights for policy 0, policy_version 29761 (0.0012) [2024-08-05 17:28:33,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24303.2, 300 sec: 24353.8). Total num frames: 243818496. Throughput: 0: 6084.7. Samples: 60961580. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:28:33,119][15372] Avg episode reward: [(0, '39.973')] [2024-08-05 17:28:35,398][15444] Updated weights for policy 0, policy_version 29771 (0.0037) [2024-08-05 17:28:38,121][15372] Fps is (10 sec: 24569.7, 60 sec: 24302.0, 300 sec: 24325.9). Total num frames: 243941376. Throughput: 0: 6093.9. Samples: 60980410. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:28:38,121][15372] Avg episode reward: [(0, '39.203')] [2024-08-05 17:28:38,994][15444] Updated weights for policy 0, policy_version 29781 (0.0042) [2024-08-05 17:28:41,575][15417] Signal inference workers to stop experience collection... (11100 times) [2024-08-05 17:28:41,575][15417] Signal inference workers to resume experience collection... 
(11100 times) [2024-08-05 17:28:41,651][15444] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-08-05 17:28:41,656][15444] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-08-05 17:28:42,176][15444] Updated weights for policy 0, policy_version 29791 (0.0024) [2024-08-05 17:28:43,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.8, 300 sec: 24353.8). Total num frames: 244064256. Throughput: 0: 6081.8. Samples: 61016700. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:28:43,119][15372] Avg episode reward: [(0, '39.040')] [2024-08-05 17:28:45,538][15444] Updated weights for policy 0, policy_version 29801 (0.0020) [2024-08-05 17:28:48,119][15372] Fps is (10 sec: 25401.1, 60 sec: 24576.1, 300 sec: 24381.6). Total num frames: 244195328. Throughput: 0: 6094.9. Samples: 61053470. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:28:48,126][15372] Avg episode reward: [(0, '40.140')] [2024-08-05 17:28:49,145][15444] Updated weights for policy 0, policy_version 29811 (0.0022) [2024-08-05 17:28:52,248][15444] Updated weights for policy 0, policy_version 29821 (0.0016) [2024-08-05 17:28:53,120][15372] Fps is (10 sec: 23752.9, 60 sec: 24165.6, 300 sec: 24353.7). Total num frames: 244301824. Throughput: 0: 6078.4. Samples: 61071190. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:28:53,128][15372] Avg episode reward: [(0, '40.719')] [2024-08-05 17:28:55,786][15444] Updated weights for policy 0, policy_version 29831 (0.0020) [2024-08-05 17:28:58,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24303.0, 300 sec: 24381.6). Total num frames: 244432896. Throughput: 0: 6082.7. Samples: 61107680. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 17:28:58,119][15372] Avg episode reward: [(0, '40.549')] [2024-08-05 17:28:59,110][15444] Updated weights for policy 0, policy_version 29841 (0.0012) [2024-08-05 17:29:02,405][15444] Updated weights for policy 0, policy_version 29851 (0.0027) [2024-08-05 17:29:03,118][15372] Fps is (10 sec: 24580.7, 60 sec: 24302.9, 300 sec: 24326.1). Total num frames: 244547584. Throughput: 0: 6059.6. Samples: 61143820. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 17:29:03,119][15372] Avg episode reward: [(0, '40.660')] [2024-08-05 17:29:05,880][15444] Updated weights for policy 0, policy_version 29861 (0.0021) [2024-08-05 17:29:08,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24326.1). Total num frames: 244670464. Throughput: 0: 6072.7. Samples: 61162500. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 17:29:08,126][15372] Avg episode reward: [(0, '40.844')] [2024-08-05 17:29:09,047][15444] Updated weights for policy 0, policy_version 29871 (0.0011) [2024-08-05 17:29:12,865][15444] Updated weights for policy 0, policy_version 29881 (0.0030) [2024-08-05 17:29:13,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24303.0, 300 sec: 24353.9). Total num frames: 244793344. Throughput: 0: 6060.6. Samples: 61198210. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 17:29:13,119][15372] Avg episode reward: [(0, '40.406')] [2024-08-05 17:29:16,035][15444] Updated weights for policy 0, policy_version 29891 (0.0011) [2024-08-05 17:29:18,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24326.1). Total num frames: 244916224. Throughput: 0: 6065.8. Samples: 61234540. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:29:18,119][15372] Avg episode reward: [(0, '40.308')] [2024-08-05 17:29:19,530][15444] Updated weights for policy 0, policy_version 29901 (0.0026) [2024-08-05 17:29:22,660][15444] Updated weights for policy 0, policy_version 29911 (0.0011) [2024-08-05 17:29:23,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24166.4, 300 sec: 24326.1). Total num frames: 245030912. Throughput: 0: 6052.3. Samples: 61252750. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:29:23,119][15372] Avg episode reward: [(0, '39.530')] [2024-08-05 17:29:26,105][15444] Updated weights for policy 0, policy_version 29921 (0.0014) [2024-08-05 17:29:28,131][15372] Fps is (10 sec: 23726.5, 60 sec: 24297.8, 300 sec: 24325.0). Total num frames: 245153792. Throughput: 0: 6051.2. Samples: 61289080. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:29:28,139][15372] Avg episode reward: [(0, '39.900')] [2024-08-05 17:29:29,810][15444] Updated weights for policy 0, policy_version 29931 (0.0012) [2024-08-05 17:29:30,386][15417] Signal inference workers to stop experience collection... (11150 times) [2024-08-05 17:29:30,387][15417] Signal inference workers to resume experience collection... (11150 times) [2024-08-05 17:29:30,413][15444] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-08-05 17:29:30,417][15444] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-08-05 17:29:32,796][15444] Updated weights for policy 0, policy_version 29941 (0.0011) [2024-08-05 17:29:33,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24439.4, 300 sec: 24353.8). Total num frames: 245284864. Throughput: 0: 6058.7. Samples: 61326110. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:29:33,119][15372] Avg episode reward: [(0, '40.162')] [2024-08-05 17:29:36,474][15444] Updated weights for policy 0, policy_version 29951 (0.0012) [2024-08-05 17:29:38,118][15372] Fps is (10 sec: 23787.1, 60 sec: 24167.4, 300 sec: 24298.3). Total num frames: 245391360. Throughput: 0: 6063.8. Samples: 61344050. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 17:29:38,119][15372] Avg episode reward: [(0, '41.086')] [2024-08-05 17:29:39,654][15444] Updated weights for policy 0, policy_version 29961 (0.0015) [2024-08-05 17:29:42,943][15444] Updated weights for policy 0, policy_version 29971 (0.0025) [2024-08-05 17:29:43,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24303.0, 300 sec: 24353.9). Total num frames: 245522432. Throughput: 0: 6070.9. Samples: 61380870. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 17:29:43,119][15372] Avg episode reward: [(0, '40.615')] [2024-08-05 17:29:46,411][15444] Updated weights for policy 0, policy_version 29981 (0.0012) [2024-08-05 17:29:48,119][15372] Fps is (10 sec: 25395.0, 60 sec: 24166.4, 300 sec: 24326.1). Total num frames: 245645312. Throughput: 0: 6070.6. Samples: 61417000. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 17:29:48,126][15372] Avg episode reward: [(0, '39.369')] [2024-08-05 17:29:49,696][15444] Updated weights for policy 0, policy_version 29991 (0.0011) [2024-08-05 17:29:53,019][15444] Updated weights for policy 0, policy_version 30001 (0.0018) [2024-08-05 17:29:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24440.2, 300 sec: 24353.8). Total num frames: 245768192. Throughput: 0: 6074.3. Samples: 61435840. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 17:29:53,119][15372] Avg episode reward: [(0, '38.605')] [2024-08-05 17:29:56,358][15444] Updated weights for policy 0, policy_version 30011 (0.0015) [2024-08-05 17:29:58,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24353.9). 
Total num frames: 245891072. Throughput: 0: 6080.2. Samples: 61471820. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:29:58,126][15372] Avg episode reward: [(0, '39.595')] [2024-08-05 17:29:59,932][15444] Updated weights for policy 0, policy_version 30021 (0.0012) [2024-08-05 17:30:03,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24302.8, 300 sec: 24326.0). Total num frames: 246005760. Throughput: 0: 6089.3. Samples: 61508560. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:30:03,127][15372] Avg episode reward: [(0, '40.464')] [2024-08-05 17:30:03,257][15444] Updated weights for policy 0, policy_version 30031 (0.0038) [2024-08-05 17:30:06,414][15444] Updated weights for policy 0, policy_version 30041 (0.0030) [2024-08-05 17:30:08,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24302.8, 300 sec: 24326.0). Total num frames: 246128640. Throughput: 0: 6095.5. Samples: 61527050. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:30:08,127][15372] Avg episode reward: [(0, '40.501')] [2024-08-05 17:30:09,756][15444] Updated weights for policy 0, policy_version 30051 (0.0020) [2024-08-05 17:30:10,681][15417] Signal inference workers to stop experience collection... (11200 times) [2024-08-05 17:30:10,684][15417] Signal inference workers to resume experience collection... (11200 times) [2024-08-05 17:30:10,763][15444] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-08-05 17:30:10,763][15444] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-08-05 17:30:13,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24303.0, 300 sec: 24326.1). Total num frames: 246251520. Throughput: 0: 6101.3. Samples: 61563560. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:30:13,119][15372] Avg episode reward: [(0, '40.891')] [2024-08-05 17:30:13,368][15444] Updated weights for policy 0, policy_version 30061 (0.0016) [2024-08-05 17:30:16,670][15444] Updated weights for policy 0, policy_version 30071 (0.0014) [2024-08-05 17:30:18,118][15372] Fps is (10 sec: 25396.6, 60 sec: 24439.5, 300 sec: 24353.8). Total num frames: 246382592. Throughput: 0: 6099.8. Samples: 61600600. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:30:18,119][15372] Avg episode reward: [(0, '40.114')] [2024-08-05 17:30:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000030076_246382592.pth... [2024-08-05 17:30:18,232][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000029362_240533504.pth [2024-08-05 17:30:19,894][15444] Updated weights for policy 0, policy_version 30081 (0.0024) [2024-08-05 17:30:23,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24439.3, 300 sec: 24326.1). Total num frames: 246497280. Throughput: 0: 6102.8. Samples: 61618680. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:30:23,127][15372] Avg episode reward: [(0, '40.158')] [2024-08-05 17:30:23,318][15444] Updated weights for policy 0, policy_version 30091 (0.0022) [2024-08-05 17:30:26,558][15444] Updated weights for policy 0, policy_version 30101 (0.0013) [2024-08-05 17:30:28,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24444.6, 300 sec: 24353.8). Total num frames: 246620160. Throughput: 0: 6088.4. Samples: 61654850. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:30:28,119][15372] Avg episode reward: [(0, '40.785')] [2024-08-05 17:30:30,071][15444] Updated weights for policy 0, policy_version 30111 (0.0010) [2024-08-05 17:30:33,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24303.0, 300 sec: 24353.8). Total num frames: 246743040. 
Throughput: 0: 6107.6. Samples: 61691840. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:30:33,126][15372] Avg episode reward: [(0, '40.381')] [2024-08-05 17:30:33,515][15444] Updated weights for policy 0, policy_version 30121 (0.0012) [2024-08-05 17:30:36,592][15444] Updated weights for policy 0, policy_version 30131 (0.0021) [2024-08-05 17:30:38,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24576.0, 300 sec: 24353.9). Total num frames: 246865920. Throughput: 0: 6094.7. Samples: 61710100. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:30:38,126][15372] Avg episode reward: [(0, '39.937')] [2024-08-05 17:30:40,119][15444] Updated weights for policy 0, policy_version 30141 (0.0012) [2024-08-05 17:30:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24381.6). Total num frames: 246988800. Throughput: 0: 6120.9. Samples: 61747260. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:30:43,126][15372] Avg episode reward: [(0, '40.540')] [2024-08-05 17:30:43,327][15444] Updated weights for policy 0, policy_version 30151 (0.0012) [2024-08-05 17:30:46,756][15444] Updated weights for policy 0, policy_version 30161 (0.0016) [2024-08-05 17:30:48,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24302.9, 300 sec: 24354.0). Total num frames: 247103488. Throughput: 0: 6103.8. Samples: 61783230. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:30:48,119][15372] Avg episode reward: [(0, '41.300')] [2024-08-05 17:30:50,305][15444] Updated weights for policy 0, policy_version 30171 (0.0013) [2024-08-05 17:30:53,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24353.8). Total num frames: 247226368. Throughput: 0: 6106.5. Samples: 61801840. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:30:53,119][15372] Avg episode reward: [(0, '40.645')] [2024-08-05 17:30:53,570][15444] Updated weights for policy 0, policy_version 30181 (0.0030) [2024-08-05 17:30:55,528][15417] Signal inference workers to stop experience collection... (11250 times) [2024-08-05 17:30:55,529][15417] Signal inference workers to resume experience collection... (11250 times) [2024-08-05 17:30:55,576][15444] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-08-05 17:30:55,582][15444] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-08-05 17:30:57,125][15444] Updated weights for policy 0, policy_version 30191 (0.0021) [2024-08-05 17:30:58,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.8, 300 sec: 24381.6). Total num frames: 247349248. Throughput: 0: 6078.6. Samples: 61837100. Policy #0 lag: (min: 1.0, avg: 3.7, max: 9.0) [2024-08-05 17:30:58,119][15372] Avg episode reward: [(0, '40.758')] [2024-08-05 17:31:00,322][15444] Updated weights for policy 0, policy_version 30201 (0.0016) [2024-08-05 17:31:03,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24302.9, 300 sec: 24353.8). Total num frames: 247463936. Throughput: 0: 6072.8. Samples: 61873880. Policy #0 lag: (min: 1.0, avg: 3.7, max: 9.0) [2024-08-05 17:31:03,119][15372] Avg episode reward: [(0, '40.012')] [2024-08-05 17:31:03,922][15444] Updated weights for policy 0, policy_version 30211 (0.0015) [2024-08-05 17:31:07,343][15444] Updated weights for policy 0, policy_version 30221 (0.0022) [2024-08-05 17:31:08,118][15372] Fps is (10 sec: 23757.5, 60 sec: 24303.2, 300 sec: 24353.9). Total num frames: 247586816. Throughput: 0: 6071.2. Samples: 61891880. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 9.0) [2024-08-05 17:31:08,119][15372] Avg episode reward: [(0, '40.083')] [2024-08-05 17:31:10,545][15444] Updated weights for policy 0, policy_version 30231 (0.0012) [2024-08-05 17:31:13,119][15372] Fps is (10 sec: 24576.6, 60 sec: 24302.9, 300 sec: 24326.1). Total num frames: 247709696. Throughput: 0: 6066.9. Samples: 61927860. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:31:13,126][15372] Avg episode reward: [(0, '40.102')] [2024-08-05 17:31:14,059][15444] Updated weights for policy 0, policy_version 30241 (0.0014) [2024-08-05 17:31:17,349][15444] Updated weights for policy 0, policy_version 30251 (0.0022) [2024-08-05 17:31:18,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24353.8). Total num frames: 247832576. Throughput: 0: 6044.0. Samples: 61963820. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:31:18,119][15372] Avg episode reward: [(0, '40.433')] [2024-08-05 17:31:20,835][15444] Updated weights for policy 0, policy_version 30261 (0.0021) [2024-08-05 17:31:23,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.5, 300 sec: 24298.3). Total num frames: 247947264. Throughput: 0: 6048.9. Samples: 61982300. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:31:23,126][15372] Avg episode reward: [(0, '40.854')] [2024-08-05 17:31:24,370][15444] Updated weights for policy 0, policy_version 30271 (0.0022) [2024-08-05 17:31:27,591][15444] Updated weights for policy 0, policy_version 30281 (0.0014) [2024-08-05 17:31:28,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.5, 300 sec: 24353.9). Total num frames: 248070144. Throughput: 0: 6021.6. Samples: 62018230. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 17:31:28,119][15372] Avg episode reward: [(0, '40.638')] [2024-08-05 17:31:31,017][15444] Updated weights for policy 0, policy_version 30291 (0.0027) [2024-08-05 17:31:33,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.9, 300 sec: 24298.3). 
Total num frames: 248184832. Throughput: 0: 6020.9. Samples: 62054170. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:31:33,119][15372] Avg episode reward: [(0, '40.143')] [2024-08-05 17:31:34,191][15444] Updated weights for policy 0, policy_version 30301 (0.0022) [2024-08-05 17:31:37,745][15417] Signal inference workers to stop experience collection... (11300 times) [2024-08-05 17:31:37,753][15417] Signal inference workers to resume experience collection... (11300 times) [2024-08-05 17:31:37,821][15444] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-08-05 17:31:37,821][15444] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-08-05 17:31:37,844][15444] Updated weights for policy 0, policy_version 30311 (0.0025) [2024-08-05 17:31:38,119][15372] Fps is (10 sec: 23754.5, 60 sec: 24029.5, 300 sec: 24298.2). Total num frames: 248307712. Throughput: 0: 6011.4. Samples: 62072360. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:31:38,120][15372] Avg episode reward: [(0, '40.495')] [2024-08-05 17:31:41,251][15444] Updated weights for policy 0, policy_version 30321 (0.0017) [2024-08-05 17:31:43,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24166.4, 300 sec: 24326.1). Total num frames: 248438784. Throughput: 0: 6029.4. Samples: 62108420. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:31:43,126][15372] Avg episode reward: [(0, '39.927')] [2024-08-05 17:31:44,584][15444] Updated weights for policy 0, policy_version 30331 (0.0011) [2024-08-05 17:31:48,107][15444] Updated weights for policy 0, policy_version 30341 (0.0013) [2024-08-05 17:31:48,118][15372] Fps is (10 sec: 24578.1, 60 sec: 24166.5, 300 sec: 24326.1). Total num frames: 248553472. Throughput: 0: 6007.2. Samples: 62144200. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:31:48,119][15372] Avg episode reward: [(0, '38.947')] [2024-08-05 17:31:51,276][15444] Updated weights for policy 0, policy_version 30351 (0.0012) [2024-08-05 17:31:53,119][15372] Fps is (10 sec: 23755.2, 60 sec: 24166.1, 300 sec: 24326.0). Total num frames: 248676352. Throughput: 0: 6019.2. Samples: 62162750. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:31:53,127][15372] Avg episode reward: [(0, '39.706')] [2024-08-05 17:31:54,866][15444] Updated weights for policy 0, policy_version 30361 (0.0014) [2024-08-05 17:31:58,059][15444] Updated weights for policy 0, policy_version 30371 (0.0023) [2024-08-05 17:31:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.5, 300 sec: 24326.1). Total num frames: 248799232. Throughput: 0: 6025.8. Samples: 62199020. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:31:58,119][15372] Avg episode reward: [(0, '40.909')] [2024-08-05 17:32:01,708][15444] Updated weights for policy 0, policy_version 30381 (0.0030) [2024-08-05 17:32:03,118][15372] Fps is (10 sec: 23758.6, 60 sec: 24166.6, 300 sec: 24270.5). Total num frames: 248913920. Throughput: 0: 6000.2. Samples: 62233830. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:32:03,119][15372] Avg episode reward: [(0, '40.299')] [2024-08-05 17:32:05,147][15444] Updated weights for policy 0, policy_version 30391 (0.0021) [2024-08-05 17:32:08,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24298.3). Total num frames: 249036800. Throughput: 0: 6014.5. Samples: 62252950. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:32:08,126][15372] Avg episode reward: [(0, '40.847')] [2024-08-05 17:32:08,461][15444] Updated weights for policy 0, policy_version 30401 (0.0023) [2024-08-05 17:32:11,824][15444] Updated weights for policy 0, policy_version 30411 (0.0010) [2024-08-05 17:32:13,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.9, 300 sec: 24270.5). 
Total num frames: 249151488. Throughput: 0: 6025.7. Samples: 62289390. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:32:13,119][15372] Avg episode reward: [(0, '40.788')] [2024-08-05 17:32:15,145][15444] Updated weights for policy 0, policy_version 30421 (0.0017) [2024-08-05 17:32:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24270.5). Total num frames: 249274368. Throughput: 0: 6021.3. Samples: 62325130. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:32:18,126][15372] Avg episode reward: [(0, '41.006')] [2024-08-05 17:32:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000030429_249274368.pth... [2024-08-05 17:32:18,257][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000029719_243458048.pth [2024-08-05 17:32:18,601][15444] Updated weights for policy 0, policy_version 30431 (0.0025) [2024-08-05 17:32:21,951][15444] Updated weights for policy 0, policy_version 30441 (0.0011) [2024-08-05 17:32:23,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.5, 300 sec: 24270.5). Total num frames: 249397248. Throughput: 0: 6021.0. Samples: 62343300. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:32:23,119][15372] Avg episode reward: [(0, '40.665')] [2024-08-05 17:32:25,478][15444] Updated weights for policy 0, policy_version 30451 (0.0012) [2024-08-05 17:32:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.8, 300 sec: 24242.8). Total num frames: 249511936. Throughput: 0: 6014.0. Samples: 62379050. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:32:28,126][15372] Avg episode reward: [(0, '41.060')] [2024-08-05 17:32:28,653][15444] Updated weights for policy 0, policy_version 30461 (0.0016) [2024-08-05 17:32:30,380][15417] Signal inference workers to stop experience collection... 
(11350 times) [2024-08-05 17:32:30,380][15417] Signal inference workers to resume experience collection... (11350 times) [2024-08-05 17:32:30,458][15444] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-08-05 17:32:30,458][15444] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-08-05 17:32:32,276][15444] Updated weights for policy 0, policy_version 30471 (0.0015) [2024-08-05 17:32:33,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.8, 300 sec: 24270.5). Total num frames: 249643008. Throughput: 0: 6034.0. Samples: 62415730. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:32:33,126][15372] Avg episode reward: [(0, '41.144')] [2024-08-05 17:32:35,325][15444] Updated weights for policy 0, policy_version 30481 (0.0024) [2024-08-05 17:32:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.8, 300 sec: 24242.8). Total num frames: 249757696. Throughput: 0: 6037.0. Samples: 62434410. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:32:38,119][15372] Avg episode reward: [(0, '40.991')] [2024-08-05 17:32:38,786][15444] Updated weights for policy 0, policy_version 30491 (0.0021) [2024-08-05 17:32:42,397][15444] Updated weights for policy 0, policy_version 30501 (0.0031) [2024-08-05 17:32:43,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24029.9, 300 sec: 24270.6). Total num frames: 249880576. Throughput: 0: 6017.6. Samples: 62469810. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:32:43,119][15372] Avg episode reward: [(0, '41.350')] [2024-08-05 17:32:45,749][15444] Updated weights for policy 0, policy_version 30511 (0.0023) [2024-08-05 17:32:48,134][15372] Fps is (10 sec: 23721.0, 60 sec: 24023.8, 300 sec: 24213.8). Total num frames: 249995264. Throughput: 0: 6051.3. Samples: 62506230. 
Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:32:48,141][15372] Avg episode reward: [(0, '40.346')] [2024-08-05 17:32:49,209][15444] Updated weights for policy 0, policy_version 30521 (0.0027) [2024-08-05 17:32:52,702][15444] Updated weights for policy 0, policy_version 30531 (0.0012) [2024-08-05 17:32:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.2, 300 sec: 24215.0). Total num frames: 250118144. Throughput: 0: 6021.3. Samples: 62523910. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:32:53,119][15372] Avg episode reward: [(0, '40.300')] [2024-08-05 17:32:55,821][15444] Updated weights for policy 0, policy_version 30541 (0.0020) [2024-08-05 17:32:58,118][15372] Fps is (10 sec: 24613.1, 60 sec: 24029.9, 300 sec: 24242.8). Total num frames: 250241024. Throughput: 0: 6010.2. Samples: 62559850. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:32:58,126][15372] Avg episode reward: [(0, '39.849')] [2024-08-05 17:32:59,669][15444] Updated weights for policy 0, policy_version 30551 (0.0014) [2024-08-05 17:33:03,007][15444] Updated weights for policy 0, policy_version 30561 (0.0016) [2024-08-05 17:33:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 250355712. Throughput: 0: 6002.4. Samples: 62595240. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:33:03,119][15372] Avg episode reward: [(0, '40.815')] [2024-08-05 17:33:06,271][15444] Updated weights for policy 0, policy_version 30571 (0.0014) [2024-08-05 17:33:08,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24029.8, 300 sec: 24215.0). Total num frames: 250478592. Throughput: 0: 6016.6. Samples: 62614050. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:33:08,126][15372] Avg episode reward: [(0, '40.571')] [2024-08-05 17:33:09,620][15444] Updated weights for policy 0, policy_version 30581 (0.0021) [2024-08-05 17:33:12,844][15444] Updated weights for policy 0, policy_version 30591 (0.0032) [2024-08-05 17:33:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 250601472. Throughput: 0: 6034.4. Samples: 62650600. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:33:13,119][15372] Avg episode reward: [(0, '40.709')] [2024-08-05 17:33:16,484][15444] Updated weights for policy 0, policy_version 30601 (0.0011) [2024-08-05 17:33:18,119][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 250724352. Throughput: 0: 6023.1. Samples: 62686770. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:33:18,119][15372] Avg episode reward: [(0, '40.827')] [2024-08-05 17:33:19,696][15444] Updated weights for policy 0, policy_version 30611 (0.0020) [2024-08-05 17:33:23,103][15444] Updated weights for policy 0, policy_version 30621 (0.0018) [2024-08-05 17:33:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 250847232. Throughput: 0: 6028.4. Samples: 62705690. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:33:23,119][15372] Avg episode reward: [(0, '40.598')] [2024-08-05 17:33:26,493][15444] Updated weights for policy 0, policy_version 30631 (0.0031) [2024-08-05 17:33:27,704][15417] Signal inference workers to stop experience collection... (11400 times) [2024-08-05 17:33:27,712][15417] Signal inference workers to resume experience collection... 
(11400 times) [2024-08-05 17:33:27,751][15444] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-08-05 17:33:27,751][15444] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-08-05 17:33:28,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 250961920. Throughput: 0: 6032.2. Samples: 62741260. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 17:33:28,119][15372] Avg episode reward: [(0, '40.497')] [2024-08-05 17:33:29,823][15444] Updated weights for policy 0, policy_version 30641 (0.0013) [2024-08-05 17:33:33,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24029.9, 300 sec: 24215.2). Total num frames: 251084800. Throughput: 0: 6031.1. Samples: 62777540. Policy #0 lag: (min: 1.0, avg: 4.4, max: 7.0) [2024-08-05 17:33:33,126][15372] Avg episode reward: [(0, '40.149')] [2024-08-05 17:33:33,299][15444] Updated weights for policy 0, policy_version 30651 (0.0029) [2024-08-05 17:33:36,907][15444] Updated weights for policy 0, policy_version 30661 (0.0012) [2024-08-05 17:33:38,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 251207680. Throughput: 0: 6052.4. Samples: 62796270. Policy #0 lag: (min: 1.0, avg: 4.4, max: 7.0) [2024-08-05 17:33:38,119][15372] Avg episode reward: [(0, '41.107')] [2024-08-05 17:33:39,921][15444] Updated weights for policy 0, policy_version 30671 (0.0013) [2024-08-05 17:33:43,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 251330560. Throughput: 0: 6066.0. Samples: 62832820. Policy #0 lag: (min: 1.0, avg: 4.4, max: 7.0) [2024-08-05 17:33:43,126][15372] Avg episode reward: [(0, '41.342')] [2024-08-05 17:33:43,376][15444] Updated weights for policy 0, policy_version 30681 (0.0023) [2024-08-05 17:33:46,706][15444] Updated weights for policy 0, policy_version 30691 (0.0012) [2024-08-05 17:33:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24309.1, 300 sec: 24242.9). 
Total num frames: 251453440. Throughput: 0: 6082.9. Samples: 62868970. Policy #0 lag: (min: 1.0, avg: 4.4, max: 7.0) [2024-08-05 17:33:48,119][15372] Avg episode reward: [(0, '40.729')] [2024-08-05 17:33:50,167][15444] Updated weights for policy 0, policy_version 30701 (0.0019) [2024-08-05 17:33:53,120][15372] Fps is (10 sec: 24572.4, 60 sec: 24302.3, 300 sec: 24214.9). Total num frames: 251576320. Throughput: 0: 6064.1. Samples: 62886940. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:33:53,129][15372] Avg episode reward: [(0, '39.812')] [2024-08-05 17:33:53,740][15444] Updated weights for policy 0, policy_version 30711 (0.0028) [2024-08-05 17:33:56,895][15444] Updated weights for policy 0, policy_version 30721 (0.0012) [2024-08-05 17:33:58,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 251682816. Throughput: 0: 6041.8. Samples: 62922480. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:33:58,119][15372] Avg episode reward: [(0, '40.324')] [2024-08-05 17:34:00,473][15444] Updated weights for policy 0, policy_version 30731 (0.0028) [2024-08-05 17:34:03,119][15372] Fps is (10 sec: 23759.7, 60 sec: 24302.8, 300 sec: 24215.0). Total num frames: 251813888. Throughput: 0: 6048.9. Samples: 62958970. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:34:03,126][15372] Avg episode reward: [(0, '39.929')] [2024-08-05 17:34:03,780][15444] Updated weights for policy 0, policy_version 30741 (0.0020) [2024-08-05 17:34:07,337][15444] Updated weights for policy 0, policy_version 30751 (0.0030) [2024-08-05 17:34:08,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 251928576. Throughput: 0: 6016.4. Samples: 62976430. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:34:08,120][15372] Avg episode reward: [(0, '40.021')] [2024-08-05 17:34:10,851][15444] Updated weights for policy 0, policy_version 30761 (0.0021) [2024-08-05 17:34:13,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 252051456. Throughput: 0: 6023.1. Samples: 63012300. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:34:13,127][15372] Avg episode reward: [(0, '41.196')] [2024-08-05 17:34:14,150][15444] Updated weights for policy 0, policy_version 30771 (0.0013) [2024-08-05 17:34:17,571][15444] Updated weights for policy 0, policy_version 30781 (0.0013) [2024-08-05 17:34:18,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 252174336. Throughput: 0: 6011.6. Samples: 63048060. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:34:18,119][15372] Avg episode reward: [(0, '40.689')] [2024-08-05 17:34:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000030783_252174336.pth... [2024-08-05 17:34:18,219][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000030076_246382592.pth [2024-08-05 17:34:21,085][15444] Updated weights for policy 0, policy_version 30791 (0.0010) [2024-08-05 17:34:23,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24029.9, 300 sec: 24188.3). Total num frames: 252289024. Throughput: 0: 6003.6. Samples: 63066430. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:34:23,120][15372] Avg episode reward: [(0, '40.623')] [2024-08-05 17:34:24,279][15444] Updated weights for policy 0, policy_version 30801 (0.0014) [2024-08-05 17:34:27,907][15444] Updated weights for policy 0, policy_version 30811 (0.0017) [2024-08-05 17:34:28,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 252411904. 
Throughput: 0: 5993.3. Samples: 63102520. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:34:28,119][15372] Avg episode reward: [(0, '39.699')] [2024-08-05 17:34:31,575][15444] Updated weights for policy 0, policy_version 30821 (0.0012) [2024-08-05 17:34:31,908][15417] Signal inference workers to stop experience collection... (11450 times) [2024-08-05 17:34:31,909][15417] Signal inference workers to resume experience collection... (11450 times) [2024-08-05 17:34:31,973][15444] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-08-05 17:34:31,973][15444] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-08-05 17:34:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 252526592. Throughput: 0: 5968.0. Samples: 63137530. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 17:34:33,119][15372] Avg episode reward: [(0, '39.970')] [2024-08-05 17:34:34,558][15444] Updated weights for policy 0, policy_version 30831 (0.0015) [2024-08-05 17:34:38,119][15372] Fps is (10 sec: 22936.2, 60 sec: 23893.1, 300 sec: 24131.6). Total num frames: 252641280. Throughput: 0: 5992.1. Samples: 63156580. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 17:34:38,127][15372] Avg episode reward: [(0, '40.331')] [2024-08-05 17:34:38,189][15444] Updated weights for policy 0, policy_version 30841 (0.0025) [2024-08-05 17:34:41,168][15444] Updated weights for policy 0, policy_version 30851 (0.0013) [2024-08-05 17:34:43,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 252772352. Throughput: 0: 5996.2. Samples: 63192310. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 17:34:43,127][15372] Avg episode reward: [(0, '40.399')] [2024-08-05 17:34:44,797][15444] Updated weights for policy 0, policy_version 30861 (0.0011) [2024-08-05 17:34:48,118][15372] Fps is (10 sec: 24577.4, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 252887040. 
Throughput: 0: 5983.8. Samples: 63228240. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 17:34:48,126][15372] Avg episode reward: [(0, '39.765')] [2024-08-05 17:34:48,614][15444] Updated weights for policy 0, policy_version 30871 (0.0027) [2024-08-05 17:34:51,601][15444] Updated weights for policy 0, policy_version 30881 (0.0029) [2024-08-05 17:34:53,119][15372] Fps is (10 sec: 22938.2, 60 sec: 23757.3, 300 sec: 24103.9). Total num frames: 253001728. Throughput: 0: 6017.4. Samples: 63247210. Policy #0 lag: (min: 0.0, avg: 4.9, max: 9.0) [2024-08-05 17:34:53,127][15372] Avg episode reward: [(0, '39.537')] [2024-08-05 17:34:55,212][15444] Updated weights for policy 0, policy_version 30891 (0.0011) [2024-08-05 17:34:58,119][15372] Fps is (10 sec: 25395.1, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 253140992. Throughput: 0: 6034.0. Samples: 63283830. Policy #0 lag: (min: 0.0, avg: 4.9, max: 9.0) [2024-08-05 17:34:58,120][15372] Avg episode reward: [(0, '40.291')] [2024-08-05 17:34:58,121][15444] Updated weights for policy 0, policy_version 30901 (0.0016) [2024-08-05 17:35:00,192][15417] Signal inference workers to stop experience collection... (11500 times) [2024-08-05 17:35:00,192][15417] Signal inference workers to resume experience collection... (11500 times) [2024-08-05 17:35:00,235][15444] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-08-05 17:35:00,235][15444] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-08-05 17:35:01,939][15444] Updated weights for policy 0, policy_version 30911 (0.0014) [2024-08-05 17:35:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 253247488. Throughput: 0: 6021.1. Samples: 63319010. 
Policy #0 lag: (min: 0.0, avg: 4.9, max: 9.0) [2024-08-05 17:35:03,119][15372] Avg episode reward: [(0, '41.045')] [2024-08-05 17:35:05,339][15444] Updated weights for policy 0, policy_version 30921 (0.0025) [2024-08-05 17:35:08,118][15372] Fps is (10 sec: 22937.8, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 253370368. Throughput: 0: 6039.1. Samples: 63338190. Policy #0 lag: (min: 0.0, avg: 4.9, max: 9.0) [2024-08-05 17:35:08,119][15372] Avg episode reward: [(0, '39.930')] [2024-08-05 17:35:08,537][15444] Updated weights for policy 0, policy_version 30931 (0.0019) [2024-08-05 17:35:12,088][15444] Updated weights for policy 0, policy_version 30941 (0.0035) [2024-08-05 17:35:13,118][15372] Fps is (10 sec: 25395.4, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 253501440. Throughput: 0: 6029.1. Samples: 63373830. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 17:35:13,119][15372] Avg episode reward: [(0, '39.850')] [2024-08-05 17:35:15,088][15444] Updated weights for policy 0, policy_version 30951 (0.0024) [2024-08-05 17:35:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 253607936. Throughput: 0: 6056.2. Samples: 63410060. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 17:35:18,126][15372] Avg episode reward: [(0, '40.677')] [2024-08-05 17:35:18,771][15444] Updated weights for policy 0, policy_version 30961 (0.0034) [2024-08-05 17:35:22,305][15444] Updated weights for policy 0, policy_version 30971 (0.0022) [2024-08-05 17:35:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 253739008. Throughput: 0: 6036.1. Samples: 63428200. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 17:35:23,120][15372] Avg episode reward: [(0, '40.049')] [2024-08-05 17:35:25,310][15444] Updated weights for policy 0, policy_version 30981 (0.0015) [2024-08-05 17:35:28,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24131.7). 
Total num frames: 253861888. Throughput: 0: 6059.8. Samples: 63465000. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 17:35:28,119][15372] Avg episode reward: [(0, '40.127')] [2024-08-05 17:35:28,988][15444] Updated weights for policy 0, policy_version 30991 (0.0012) [2024-08-05 17:35:31,358][15417] Signal inference workers to stop experience collection... (11550 times) [2024-08-05 17:35:31,359][15417] Signal inference workers to resume experience collection... (11550 times) [2024-08-05 17:35:31,409][15444] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-08-05 17:35:31,409][15444] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-08-05 17:35:32,141][15444] Updated weights for policy 0, policy_version 31001 (0.0022) [2024-08-05 17:35:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 253976576. Throughput: 0: 6078.4. Samples: 63501770. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:35:33,119][15372] Avg episode reward: [(0, '39.779')] [2024-08-05 17:35:35,371][15444] Updated weights for policy 0, policy_version 31011 (0.0013) [2024-08-05 17:35:38,123][15372] Fps is (10 sec: 23746.0, 60 sec: 24301.3, 300 sec: 24103.5). Total num frames: 254099456. Throughput: 0: 6053.2. Samples: 63519630. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:35:38,124][15372] Avg episode reward: [(0, '39.578')] [2024-08-05 17:35:38,950][15444] Updated weights for policy 0, policy_version 31021 (0.0012) [2024-08-05 17:35:42,224][15444] Updated weights for policy 0, policy_version 31031 (0.0012) [2024-08-05 17:35:43,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 254222336. Throughput: 0: 6054.7. Samples: 63556290. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:35:43,126][15372] Avg episode reward: [(0, '40.661')] [2024-08-05 17:35:45,750][15444] Updated weights for policy 0, policy_version 31041 (0.0018) [2024-08-05 17:35:48,118][15372] Fps is (10 sec: 24587.3, 60 sec: 24303.0, 300 sec: 24131.7). Total num frames: 254345216. Throughput: 0: 6074.0. Samples: 63592340. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:35:48,119][15372] Avg episode reward: [(0, '40.480')] [2024-08-05 17:35:48,913][15444] Updated weights for policy 0, policy_version 31051 (0.0021) [2024-08-05 17:35:52,684][15444] Updated weights for policy 0, policy_version 31061 (0.0025) [2024-08-05 17:35:53,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.5, 300 sec: 24131.7). Total num frames: 254468096. Throughput: 0: 6050.0. Samples: 63610440. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:35:53,119][15372] Avg episode reward: [(0, '40.695')] [2024-08-05 17:35:55,702][15444] Updated weights for policy 0, policy_version 31071 (0.0012) [2024-08-05 17:35:58,121][15372] Fps is (10 sec: 23751.0, 60 sec: 24028.9, 300 sec: 24131.5). Total num frames: 254582784. Throughput: 0: 6047.4. Samples: 63645980. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:35:58,129][15372] Avg episode reward: [(0, '40.580')] [2024-08-05 17:35:59,447][15444] Updated weights for policy 0, policy_version 31081 (0.0017) [2024-08-05 17:36:03,016][15444] Updated weights for policy 0, policy_version 31091 (0.0033) [2024-08-05 17:36:03,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 254697472. Throughput: 0: 6023.8. Samples: 63681130. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:36:03,119][15372] Avg episode reward: [(0, '41.312')] [2024-08-05 17:36:06,126][15444] Updated weights for policy 0, policy_version 31101 (0.0024) [2024-08-05 17:36:08,119][15372] Fps is (10 sec: 23761.1, 60 sec: 24166.2, 300 sec: 24103.9). 
Total num frames: 254820352. Throughput: 0: 6039.0. Samples: 63699960. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:36:08,127][15372] Avg episode reward: [(0, '40.479')] [2024-08-05 17:36:09,758][15444] Updated weights for policy 0, policy_version 31111 (0.0011) [2024-08-05 17:36:10,110][15417] Signal inference workers to stop experience collection... (11600 times) [2024-08-05 17:36:10,110][15417] Signal inference workers to resume experience collection... (11600 times) [2024-08-05 17:36:10,183][15444] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-08-05 17:36:10,183][15444] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-08-05 17:36:12,903][15444] Updated weights for policy 0, policy_version 31121 (0.0015) [2024-08-05 17:36:13,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 254943232. Throughput: 0: 6030.9. Samples: 63736390. Policy #0 lag: (min: 1.0, avg: 3.2, max: 7.0) [2024-08-05 17:36:13,119][15372] Avg episode reward: [(0, '40.764')] [2024-08-05 17:36:16,423][15444] Updated weights for policy 0, policy_version 31131 (0.0019) [2024-08-05 17:36:18,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.2, 300 sec: 24103.9). Total num frames: 255057920. Throughput: 0: 5999.3. Samples: 63771740. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:36:18,127][15372] Avg episode reward: [(0, '41.437')] [2024-08-05 17:36:18,207][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000031136_255066112.pth... [2024-08-05 17:36:18,369][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000030429_249274368.pth [2024-08-05 17:36:19,913][15444] Updated weights for policy 0, policy_version 31141 (0.0021) [2024-08-05 17:36:23,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24103.9). 
Total num frames: 255180800. Throughput: 0: 6016.2. Samples: 63790330. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:36:23,126][15372] Avg episode reward: [(0, '40.697')] [2024-08-05 17:36:23,187][15444] Updated weights for policy 0, policy_version 31151 (0.0015) [2024-08-05 17:36:26,512][15444] Updated weights for policy 0, policy_version 31161 (0.0011) [2024-08-05 17:36:28,118][15372] Fps is (10 sec: 24577.3, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 255303680. Throughput: 0: 6002.2. Samples: 63826390. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:36:28,126][15372] Avg episode reward: [(0, '40.553')] [2024-08-05 17:36:29,907][15444] Updated weights for policy 0, policy_version 31171 (0.0025) [2024-08-05 17:36:33,127][15372] Fps is (10 sec: 24558.6, 60 sec: 24163.6, 300 sec: 24131.2). Total num frames: 255426560. Throughput: 0: 6024.4. Samples: 63863480. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 17:36:33,135][15372] Avg episode reward: [(0, '40.632')] [2024-08-05 17:36:33,154][15444] Updated weights for policy 0, policy_version 31181 (0.0023) [2024-08-05 17:36:36,653][15444] Updated weights for policy 0, policy_version 31191 (0.0020) [2024-08-05 17:36:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24168.2, 300 sec: 24103.9). Total num frames: 255549440. Throughput: 0: 6031.3. Samples: 63881850. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:36:38,119][15372] Avg episode reward: [(0, '41.554')] [2024-08-05 17:36:39,883][15444] Updated weights for policy 0, policy_version 31201 (0.0019) [2024-08-05 17:36:43,118][15372] Fps is (10 sec: 24593.4, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 255672320. Throughput: 0: 6059.7. Samples: 63918650. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:36:43,126][15372] Avg episode reward: [(0, '41.112')] [2024-08-05 17:36:43,382][15444] Updated weights for policy 0, policy_version 31211 (0.0012) [2024-08-05 17:36:46,696][15444] Updated weights for policy 0, policy_version 31221 (0.0026) [2024-08-05 17:36:48,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24166.2, 300 sec: 24131.7). Total num frames: 255795200. Throughput: 0: 6076.4. Samples: 63954570. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:36:48,119][15372] Avg episode reward: [(0, '41.016')] [2024-08-05 17:36:49,962][15444] Updated weights for policy 0, policy_version 31231 (0.0011) [2024-08-05 17:36:53,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 255918080. Throughput: 0: 6069.2. Samples: 63973070. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:36:53,126][15372] Avg episode reward: [(0, '40.501')] [2024-08-05 17:36:53,612][15444] Updated weights for policy 0, policy_version 31241 (0.0017) [2024-08-05 17:36:56,591][15444] Updated weights for policy 0, policy_version 31251 (0.0032) [2024-08-05 17:36:58,064][15417] Signal inference workers to stop experience collection... (11650 times) [2024-08-05 17:36:58,072][15417] Signal inference workers to resume experience collection... (11650 times) [2024-08-05 17:36:58,112][15444] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-08-05 17:36:58,113][15444] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-08-05 17:36:58,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24167.2, 300 sec: 24131.6). Total num frames: 256032768. Throughput: 0: 6059.1. Samples: 64009050. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:36:58,119][15372] Avg episode reward: [(0, '40.230')] [2024-08-05 17:37:00,136][15444] Updated weights for policy 0, policy_version 31261 (0.0017) [2024-08-05 17:37:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24159.5). Total num frames: 256163840. Throughput: 0: 6091.2. Samples: 64045840. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:37:03,126][15372] Avg episode reward: [(0, '39.767')] [2024-08-05 17:37:03,767][15444] Updated weights for policy 0, policy_version 31271 (0.0033) [2024-08-05 17:37:06,889][15444] Updated weights for policy 0, policy_version 31281 (0.0012) [2024-08-05 17:37:08,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24303.2, 300 sec: 24159.5). Total num frames: 256278528. Throughput: 0: 6081.3. Samples: 64063990. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:37:08,119][15372] Avg episode reward: [(0, '40.409')] [2024-08-05 17:37:10,390][15444] Updated weights for policy 0, policy_version 31291 (0.0015) [2024-08-05 17:37:13,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 256401408. Throughput: 0: 6097.5. Samples: 64100780. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:37:13,119][15372] Avg episode reward: [(0, '40.699')] [2024-08-05 17:37:13,449][15444] Updated weights for policy 0, policy_version 31301 (0.0012) [2024-08-05 17:37:17,220][15444] Updated weights for policy 0, policy_version 31311 (0.0015) [2024-08-05 17:37:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.7, 300 sec: 24159.5). Total num frames: 256524288. Throughput: 0: 6071.8. Samples: 64136670. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:37:18,119][15372] Avg episode reward: [(0, '39.851')] [2024-08-05 17:37:20,502][15444] Updated weights for policy 0, policy_version 31321 (0.0019) [2024-08-05 17:37:23,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24439.5, 300 sec: 24187.2). 
Total num frames: 256647168. Throughput: 0: 6082.9. Samples: 64155580. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:37:23,119][15372] Avg episode reward: [(0, '39.755')] [2024-08-05 17:37:23,784][15444] Updated weights for policy 0, policy_version 31331 (0.0014) [2024-08-05 17:37:27,278][15444] Updated weights for policy 0, policy_version 31341 (0.0021) [2024-08-05 17:37:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 256761856. Throughput: 0: 6059.8. Samples: 64191340. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:37:28,119][15372] Avg episode reward: [(0, '40.037')] [2024-08-05 17:37:30,491][15444] Updated weights for policy 0, policy_version 31351 (0.0017) [2024-08-05 17:37:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24305.8, 300 sec: 24159.5). Total num frames: 256884736. Throughput: 0: 6070.3. Samples: 64227730. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:37:33,119][15372] Avg episode reward: [(0, '40.204')] [2024-08-05 17:37:34,170][15444] Updated weights for policy 0, policy_version 31361 (0.0020) [2024-08-05 17:37:37,368][15444] Updated weights for policy 0, policy_version 31371 (0.0022) [2024-08-05 17:37:38,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 257007616. Throughput: 0: 6066.0. Samples: 64246040. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 17:37:38,126][15372] Avg episode reward: [(0, '40.739')] [2024-08-05 17:37:40,771][15444] Updated weights for policy 0, policy_version 31381 (0.0021) [2024-08-05 17:37:42,927][15417] Signal inference workers to stop experience collection... (11700 times) [2024-08-05 17:37:42,928][15417] Signal inference workers to resume experience collection... 
(11700 times) [2024-08-05 17:37:43,014][15444] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-08-05 17:37:43,019][15444] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-08-05 17:37:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24188.5). Total num frames: 257130496. Throughput: 0: 6063.4. Samples: 64281900. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 17:37:43,119][15372] Avg episode reward: [(0, '40.012')] [2024-08-05 17:37:44,315][15444] Updated weights for policy 0, policy_version 31391 (0.0013) [2024-08-05 17:37:47,451][15444] Updated weights for policy 0, policy_version 31401 (0.0013) [2024-08-05 17:37:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.1, 300 sec: 24187.2). Total num frames: 257253376. Throughput: 0: 6050.0. Samples: 64318090. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 17:37:48,119][15372] Avg episode reward: [(0, '40.512')] [2024-08-05 17:37:50,880][15444] Updated weights for policy 0, policy_version 31411 (0.0011) [2024-08-05 17:37:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 257368064. Throughput: 0: 6058.2. Samples: 64336610. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 17:37:53,126][15372] Avg episode reward: [(0, '40.059')] [2024-08-05 17:37:54,344][15444] Updated weights for policy 0, policy_version 31421 (0.0018) [2024-08-05 17:37:57,669][15444] Updated weights for policy 0, policy_version 31431 (0.0014) [2024-08-05 17:37:58,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24303.1, 300 sec: 24187.2). Total num frames: 257490944. Throughput: 0: 6044.0. Samples: 64372760. Policy #0 lag: (min: 0.0, avg: 3.7, max: 9.0) [2024-08-05 17:37:58,119][15372] Avg episode reward: [(0, '41.326')] [2024-08-05 17:38:01,211][15444] Updated weights for policy 0, policy_version 31441 (0.0015) [2024-08-05 17:38:03,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24029.7, 300 sec: 24159.5). 
Total num frames: 257605632. Throughput: 0: 6046.6. Samples: 64408770. Policy #0 lag: (min: 0.0, avg: 3.7, max: 9.0) [2024-08-05 17:38:03,127][15372] Avg episode reward: [(0, '41.849')] [2024-08-05 17:38:04,399][15444] Updated weights for policy 0, policy_version 31451 (0.0010) [2024-08-05 17:38:07,946][15444] Updated weights for policy 0, policy_version 31461 (0.0013) [2024-08-05 17:38:08,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 257728512. Throughput: 0: 6046.6. Samples: 64427680. Policy #0 lag: (min: 0.0, avg: 3.7, max: 9.0) [2024-08-05 17:38:08,119][15372] Avg episode reward: [(0, '41.442')] [2024-08-05 17:38:11,152][15444] Updated weights for policy 0, policy_version 31471 (0.0011) [2024-08-05 17:38:13,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 257851392. Throughput: 0: 6048.9. Samples: 64463540. Policy #0 lag: (min: 0.0, avg: 3.7, max: 9.0) [2024-08-05 17:38:13,126][15372] Avg episode reward: [(0, '40.116')] [2024-08-05 17:38:14,679][15444] Updated weights for policy 0, policy_version 31481 (0.0020) [2024-08-05 17:38:17,869][15444] Updated weights for policy 0, policy_version 31491 (0.0011) [2024-08-05 17:38:18,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.2, 300 sec: 24159.4). Total num frames: 257974272. Throughput: 0: 6055.7. Samples: 64500240. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:38:18,119][15372] Avg episode reward: [(0, '39.812')] [2024-08-05 17:38:18,124][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000031491_257974272.pth... 
[2024-08-05 17:38:18,288][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000030783_252174336.pth [2024-08-05 17:38:21,350][15444] Updated weights for policy 0, policy_version 31501 (0.0014) [2024-08-05 17:38:23,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 258097152. Throughput: 0: 6054.2. Samples: 64518480. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:38:23,126][15372] Avg episode reward: [(0, '40.444')] [2024-08-05 17:38:24,838][15444] Updated weights for policy 0, policy_version 31511 (0.0012) [2024-08-05 17:38:27,930][15444] Updated weights for policy 0, policy_version 31521 (0.0020) [2024-08-05 17:38:28,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 258220032. Throughput: 0: 6068.4. Samples: 64554980. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:38:28,119][15372] Avg episode reward: [(0, '40.905')] [2024-08-05 17:38:31,575][15444] Updated weights for policy 0, policy_version 31531 (0.0014) [2024-08-05 17:38:33,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 258334720. Throughput: 0: 6052.5. Samples: 64590450. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:38:33,119][15372] Avg episode reward: [(0, '40.816')] [2024-08-05 17:38:33,211][15417] Signal inference workers to stop experience collection... (11750 times) [2024-08-05 17:38:33,211][15417] Signal inference workers to resume experience collection... (11750 times) [2024-08-05 17:38:33,257][15444] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-08-05 17:38:33,258][15444] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-08-05 17:38:35,054][15444] Updated weights for policy 0, policy_version 31541 (0.0015) [2024-08-05 17:38:38,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24166.2, 300 sec: 24159.4). 
Total num frames: 258457600. Throughput: 0: 6048.6. Samples: 64608800. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 17:38:38,127][15372] Avg episode reward: [(0, '41.647')] [2024-08-05 17:38:38,279][15444] Updated weights for policy 0, policy_version 31551 (0.0021) [2024-08-05 17:38:41,680][15444] Updated weights for policy 0, policy_version 31561 (0.0012) [2024-08-05 17:38:43,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 258580480. Throughput: 0: 6062.0. Samples: 64645550. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 17:38:43,120][15372] Avg episode reward: [(0, '41.674')] [2024-08-05 17:38:44,870][15444] Updated weights for policy 0, policy_version 31571 (0.0013) [2024-08-05 17:38:48,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24166.4, 300 sec: 24159.6). Total num frames: 258703360. Throughput: 0: 6072.3. Samples: 64682020. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 17:38:48,126][15372] Avg episode reward: [(0, '41.195')] [2024-08-05 17:38:48,494][15444] Updated weights for policy 0, policy_version 31581 (0.0019) [2024-08-05 17:38:51,674][15444] Updated weights for policy 0, policy_version 31591 (0.0027) [2024-08-05 17:38:53,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 258826240. Throughput: 0: 6071.1. Samples: 64700880. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 17:38:53,126][15372] Avg episode reward: [(0, '41.282')] [2024-08-05 17:38:55,096][15444] Updated weights for policy 0, policy_version 31601 (0.0011) [2024-08-05 17:38:58,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 258949120. Throughput: 0: 6083.6. Samples: 64737300. 
Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:38:58,126][15372] Avg episode reward: [(0, '40.858')] [2024-08-05 17:38:58,490][15444] Updated weights for policy 0, policy_version 31611 (0.0032) [2024-08-05 17:39:01,813][15444] Updated weights for policy 0, policy_version 31621 (0.0023) [2024-08-05 17:39:03,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.1, 300 sec: 24187.3). Total num frames: 259063808. Throughput: 0: 6058.3. Samples: 64772860. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:39:03,119][15372] Avg episode reward: [(0, '40.562')] [2024-08-05 17:39:05,294][15444] Updated weights for policy 0, policy_version 31631 (0.0022) [2024-08-05 17:39:08,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 259186688. Throughput: 0: 6053.3. Samples: 64790880. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:39:08,119][15372] Avg episode reward: [(0, '40.077')] [2024-08-05 17:39:08,645][15444] Updated weights for policy 0, policy_version 31641 (0.0016) [2024-08-05 17:39:12,129][15444] Updated weights for policy 0, policy_version 31651 (0.0021) [2024-08-05 17:39:13,119][15372] Fps is (10 sec: 24574.3, 60 sec: 24302.7, 300 sec: 24187.2). Total num frames: 259309568. Throughput: 0: 6037.2. Samples: 64826660. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 17:39:13,119][15372] Avg episode reward: [(0, '40.915')] [2024-08-05 17:39:15,406][15444] Updated weights for policy 0, policy_version 31661 (0.0012) [2024-08-05 17:39:18,120][15372] Fps is (10 sec: 23752.6, 60 sec: 24165.9, 300 sec: 24187.1). Total num frames: 259424256. Throughput: 0: 6070.6. Samples: 64863640. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 17:39:18,121][15372] Avg episode reward: [(0, '41.502')] [2024-08-05 17:39:18,761][15444] Updated weights for policy 0, policy_version 31671 (0.0018) [2024-08-05 17:39:22,193][15444] Updated weights for policy 0, policy_version 31681 (0.0033) [2024-08-05 17:39:23,118][15372] Fps is (10 sec: 23758.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 259547136. Throughput: 0: 6066.7. Samples: 64881800. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 17:39:23,119][15372] Avg episode reward: [(0, '40.814')] [2024-08-05 17:39:23,123][15417] Signal inference workers to stop experience collection... (11800 times) [2024-08-05 17:39:23,124][15417] Signal inference workers to resume experience collection... (11800 times) [2024-08-05 17:39:23,201][15444] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-08-05 17:39:23,201][15444] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-08-05 17:39:25,655][15444] Updated weights for policy 0, policy_version 31691 (0.0029) [2024-08-05 17:39:28,119][15372] Fps is (10 sec: 24578.9, 60 sec: 24166.2, 300 sec: 24214.9). Total num frames: 259670016. Throughput: 0: 6054.2. Samples: 64917990. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 17:39:28,119][15372] Avg episode reward: [(0, '39.789')] [2024-08-05 17:39:29,097][15444] Updated weights for policy 0, policy_version 31701 (0.0015) [2024-08-05 17:39:32,316][15444] Updated weights for policy 0, policy_version 31711 (0.0017) [2024-08-05 17:39:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 259784704. Throughput: 0: 6030.7. Samples: 64953400. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 17:39:33,126][15372] Avg episode reward: [(0, '39.784')] [2024-08-05 17:39:35,803][15444] Updated weights for policy 0, policy_version 31721 (0.0022) [2024-08-05 17:39:38,118][15372] Fps is (10 sec: 24577.5, 60 sec: 24303.1, 300 sec: 24215.0). Total num frames: 259915776. Throughput: 0: 6028.9. Samples: 64972180. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 17:39:38,119][15372] Avg episode reward: [(0, '40.255')] [2024-08-05 17:39:39,306][15444] Updated weights for policy 0, policy_version 31731 (0.0013) [2024-08-05 17:39:42,501][15444] Updated weights for policy 0, policy_version 31741 (0.0019) [2024-08-05 17:39:43,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 260038656. Throughput: 0: 6024.0. Samples: 65008380. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 17:39:43,119][15372] Avg episode reward: [(0, '40.187')] [2024-08-05 17:39:46,126][15444] Updated weights for policy 0, policy_version 31751 (0.0013) [2024-08-05 17:39:48,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 260153344. Throughput: 0: 6035.1. Samples: 65044440. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 17:39:48,119][15372] Avg episode reward: [(0, '40.784')] [2024-08-05 17:39:49,100][15444] Updated weights for policy 0, policy_version 31761 (0.0028) [2024-08-05 17:39:52,697][15444] Updated weights for policy 0, policy_version 31771 (0.0022) [2024-08-05 17:39:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 260276224. Throughput: 0: 6046.2. Samples: 65062960. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 17:39:53,119][15372] Avg episode reward: [(0, '41.145')] [2024-08-05 17:39:56,064][15444] Updated weights for policy 0, policy_version 31781 (0.0016) [2024-08-05 17:39:58,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24242.8). 
Total num frames: 260399104. Throughput: 0: 6053.7. Samples: 65099070. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 17:39:58,119][15372] Avg episode reward: [(0, '40.819')] [2024-08-05 17:39:59,495][15444] Updated weights for policy 0, policy_version 31791 (0.0020) [2024-08-05 17:40:03,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 260505600. Throughput: 0: 6021.6. Samples: 65134600. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 17:40:03,126][15372] Avg episode reward: [(0, '41.003')] [2024-08-05 17:40:03,173][15444] Updated weights for policy 0, policy_version 31801 (0.0028) [2024-08-05 17:40:03,734][15417] Signal inference workers to stop experience collection... (11850 times) [2024-08-05 17:40:03,734][15417] Signal inference workers to resume experience collection... (11850 times) [2024-08-05 17:40:03,806][15444] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-08-05 17:40:03,806][15444] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-08-05 17:40:06,148][15444] Updated weights for policy 0, policy_version 31811 (0.0026) [2024-08-05 17:40:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 260636672. Throughput: 0: 6020.4. Samples: 65152720. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 17:40:08,126][15372] Avg episode reward: [(0, '41.400')] [2024-08-05 17:40:09,900][15444] Updated weights for policy 0, policy_version 31821 (0.0030) [2024-08-05 17:40:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24030.1, 300 sec: 24215.0). Total num frames: 260751360. Throughput: 0: 6035.2. Samples: 65189570. 
Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 17:40:13,119][15372] Avg episode reward: [(0, '41.336')] [2024-08-05 17:40:13,231][15444] Updated weights for policy 0, policy_version 31831 (0.0012) [2024-08-05 17:40:16,457][15444] Updated weights for policy 0, policy_version 31841 (0.0011) [2024-08-05 17:40:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24167.1, 300 sec: 24187.2). Total num frames: 260874240. Throughput: 0: 6042.4. Samples: 65225310. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 17:40:18,126][15372] Avg episode reward: [(0, '41.087')] [2024-08-05 17:40:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000031845_260874240.pth... [2024-08-05 17:40:18,270][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000031136_255066112.pth [2024-08-05 17:40:20,135][15444] Updated weights for policy 0, policy_version 31851 (0.0019) [2024-08-05 17:40:23,120][15372] Fps is (10 sec: 24572.3, 60 sec: 24165.8, 300 sec: 24187.1). Total num frames: 260997120. Throughput: 0: 6024.0. Samples: 65243270. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:40:23,128][15372] Avg episode reward: [(0, '39.790')] [2024-08-05 17:40:23,163][15444] Updated weights for policy 0, policy_version 31861 (0.0011) [2024-08-05 17:40:26,801][15444] Updated weights for policy 0, policy_version 31871 (0.0033) [2024-08-05 17:40:28,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 261120000. Throughput: 0: 6026.6. Samples: 65279580. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:40:28,119][15372] Avg episode reward: [(0, '40.146')] [2024-08-05 17:40:30,044][15444] Updated weights for policy 0, policy_version 31881 (0.0012) [2024-08-05 17:40:33,120][15372] Fps is (10 sec: 22938.1, 60 sec: 24029.3, 300 sec: 24159.7). Total num frames: 261226496. 
Throughput: 0: 6032.5. Samples: 65315910. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:40:33,128][15372] Avg episode reward: [(0, '40.332')] [2024-08-05 17:40:33,615][15444] Updated weights for policy 0, policy_version 31891 (0.0033) [2024-08-05 17:40:37,125][15444] Updated weights for policy 0, policy_version 31901 (0.0018) [2024-08-05 17:40:38,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 261365760. Throughput: 0: 6024.2. Samples: 65334050. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 17:40:38,119][15372] Avg episode reward: [(0, '41.262')] [2024-08-05 17:40:40,096][15444] Updated weights for policy 0, policy_version 31911 (0.0015) [2024-08-05 17:40:43,119][15372] Fps is (10 sec: 25398.3, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 261480448. Throughput: 0: 6041.1. Samples: 65370920. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:40:43,126][15372] Avg episode reward: [(0, '41.342')] [2024-08-05 17:40:43,768][15444] Updated weights for policy 0, policy_version 31921 (0.0020) [2024-08-05 17:40:44,105][15417] Signal inference workers to stop experience collection... (11900 times) [2024-08-05 17:40:44,106][15417] Signal inference workers to resume experience collection... (11900 times) [2024-08-05 17:40:44,150][15444] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-08-05 17:40:44,150][15444] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-08-05 17:40:46,904][15444] Updated weights for policy 0, policy_version 31931 (0.0018) [2024-08-05 17:40:48,119][15372] Fps is (10 sec: 22937.2, 60 sec: 24029.9, 300 sec: 24159.4). Total num frames: 261595136. Throughput: 0: 6058.0. Samples: 65407210. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:40:48,127][15372] Avg episode reward: [(0, '41.389')] [2024-08-05 17:40:50,312][15444] Updated weights for policy 0, policy_version 31941 (0.0036) [2024-08-05 17:40:53,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24215.2). Total num frames: 261726208. Throughput: 0: 6054.2. Samples: 65425160. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:40:53,119][15372] Avg episode reward: [(0, '41.012')] [2024-08-05 17:40:53,897][15444] Updated weights for policy 0, policy_version 31951 (0.0016) [2024-08-05 17:40:56,952][15444] Updated weights for policy 0, policy_version 31961 (0.0028) [2024-08-05 17:40:58,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 261840896. Throughput: 0: 6041.8. Samples: 65461450. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:40:58,119][15372] Avg episode reward: [(0, '40.828')] [2024-08-05 17:41:00,474][15444] Updated weights for policy 0, policy_version 31971 (0.0030) [2024-08-05 17:41:03,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 261963776. Throughput: 0: 6048.9. Samples: 65497510. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:41:03,130][15372] Avg episode reward: [(0, '40.771')] [2024-08-05 17:41:04,201][15444] Updated weights for policy 0, policy_version 31981 (0.0017) [2024-08-05 17:41:07,324][15444] Updated weights for policy 0, policy_version 31991 (0.0014) [2024-08-05 17:41:08,120][15372] Fps is (10 sec: 23753.9, 60 sec: 24029.4, 300 sec: 24187.1). Total num frames: 262078464. Throughput: 0: 6042.5. Samples: 65515180. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:41:08,128][15372] Avg episode reward: [(0, '40.729')] [2024-08-05 17:41:10,742][15444] Updated weights for policy 0, policy_version 32001 (0.0020) [2024-08-05 17:41:13,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.0, 300 sec: 24242.8). 
Total num frames: 262209536. Throughput: 0: 6054.5. Samples: 65552030. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:41:13,119][15372] Avg episode reward: [(0, '41.111')] [2024-08-05 17:41:14,153][15444] Updated weights for policy 0, policy_version 32011 (0.0012) [2024-08-05 17:41:17,542][15444] Updated weights for policy 0, policy_version 32021 (0.0011) [2024-08-05 17:41:18,119][15372] Fps is (10 sec: 24577.8, 60 sec: 24166.2, 300 sec: 24215.0). Total num frames: 262324224. Throughput: 0: 6049.0. Samples: 65588110. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 17:41:18,119][15372] Avg episode reward: [(0, '40.754')] [2024-08-05 17:41:20,867][15444] Updated weights for policy 0, policy_version 32031 (0.0011) [2024-08-05 17:41:23,124][15372] Fps is (10 sec: 23743.7, 60 sec: 24164.8, 300 sec: 24214.5). Total num frames: 262447104. Throughput: 0: 6055.3. Samples: 65606570. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:41:23,132][15372] Avg episode reward: [(0, '40.266')] [2024-08-05 17:41:24,278][15444] Updated weights for policy 0, policy_version 32041 (0.0012) [2024-08-05 17:41:27,835][15444] Updated weights for policy 0, policy_version 32051 (0.0027) [2024-08-05 17:41:28,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24030.0, 300 sec: 24187.8). Total num frames: 262561792. Throughput: 0: 6030.0. Samples: 65642270. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:41:28,119][15372] Avg episode reward: [(0, '41.127')] [2024-08-05 17:41:31,164][15444] Updated weights for policy 0, policy_version 32061 (0.0014) [2024-08-05 17:41:33,119][15372] Fps is (10 sec: 23769.4, 60 sec: 24303.4, 300 sec: 24187.2). Total num frames: 262684672. Throughput: 0: 6018.9. Samples: 65678060. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:41:33,127][15372] Avg episode reward: [(0, '41.138')] [2024-08-05 17:41:34,523][15444] Updated weights for policy 0, policy_version 32071 (0.0012) [2024-08-05 17:41:37,917][15444] Updated weights for policy 0, policy_version 32081 (0.0012) [2024-08-05 17:41:38,040][15417] Signal inference workers to stop experience collection... (11950 times) [2024-08-05 17:41:38,041][15417] Signal inference workers to resume experience collection... (11950 times) [2024-08-05 17:41:38,092][15444] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-08-05 17:41:38,099][15444] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-08-05 17:41:38,121][15372] Fps is (10 sec: 25389.7, 60 sec: 24165.5, 300 sec: 24214.8). Total num frames: 262815744. Throughput: 0: 6040.6. Samples: 65697000. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:41:38,121][15372] Avg episode reward: [(0, '41.432')] [2024-08-05 17:41:41,066][15444] Updated weights for policy 0, policy_version 32091 (0.0014) [2024-08-05 17:41:43,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 262930432. Throughput: 0: 6036.0. Samples: 65733070. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 17:41:43,119][15372] Avg episode reward: [(0, '41.479')] [2024-08-05 17:41:44,674][15444] Updated weights for policy 0, policy_version 32101 (0.0014) [2024-08-05 17:41:47,955][15444] Updated weights for policy 0, policy_version 32111 (0.0019) [2024-08-05 17:41:48,119][15372] Fps is (10 sec: 23760.9, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 263053312. Throughput: 0: 6033.3. Samples: 65769010. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:41:48,119][15372] Avg episode reward: [(0, '40.086')] [2024-08-05 17:41:51,488][15444] Updated weights for policy 0, policy_version 32121 (0.0020) [2024-08-05 17:41:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 263176192. Throughput: 0: 6049.1. Samples: 65787380. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:41:53,126][15372] Avg episode reward: [(0, '40.631')] [2024-08-05 17:41:54,588][15444] Updated weights for policy 0, policy_version 32131 (0.0019) [2024-08-05 17:41:58,119][15372] Fps is (10 sec: 23757.5, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 263290880. Throughput: 0: 6046.2. Samples: 65824110. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:41:58,127][15372] Avg episode reward: [(0, '40.662')] [2024-08-05 17:41:58,168][15444] Updated weights for policy 0, policy_version 32141 (0.0027) [2024-08-05 17:42:01,376][15444] Updated weights for policy 0, policy_version 32151 (0.0016) [2024-08-05 17:42:03,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 263413760. Throughput: 0: 6046.7. Samples: 65860210. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 17:42:03,126][15372] Avg episode reward: [(0, '40.687')] [2024-08-05 17:42:04,908][15444] Updated weights for policy 0, policy_version 32161 (0.0019) [2024-08-05 17:42:08,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.4, 300 sec: 24187.2). Total num frames: 263536640. Throughput: 0: 6055.8. Samples: 65879050. Policy #0 lag: (min: 1.0, avg: 3.7, max: 9.0) [2024-08-05 17:42:08,126][15372] Avg episode reward: [(0, '39.478')] [2024-08-05 17:42:08,304][15444] Updated weights for policy 0, policy_version 32171 (0.0021) [2024-08-05 17:42:11,368][15444] Updated weights for policy 0, policy_version 32181 (0.0012) [2024-08-05 17:42:13,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.4, 300 sec: 24187.2). 
Total num frames: 263659520. Throughput: 0: 6062.2. Samples: 65915070. Policy #0 lag: (min: 1.0, avg: 3.7, max: 9.0) [2024-08-05 17:42:13,126][15372] Avg episode reward: [(0, '40.960')] [2024-08-05 17:42:14,958][15444] Updated weights for policy 0, policy_version 32191 (0.0017) [2024-08-05 17:42:18,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24303.1, 300 sec: 24187.2). Total num frames: 263782400. Throughput: 0: 6081.1. Samples: 65951710. Policy #0 lag: (min: 1.0, avg: 3.7, max: 9.0) [2024-08-05 17:42:18,126][15372] Avg episode reward: [(0, '41.201')] [2024-08-05 17:42:18,131][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000032200_263782400.pth... [2024-08-05 17:42:18,296][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000031491_257974272.pth [2024-08-05 17:42:18,388][15444] Updated weights for policy 0, policy_version 32201 (0.0014) [2024-08-05 17:42:21,686][15444] Updated weights for policy 0, policy_version 32211 (0.0012) [2024-08-05 17:42:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24305.2, 300 sec: 24215.0). Total num frames: 263905280. Throughput: 0: 6062.7. Samples: 65969810. Policy #0 lag: (min: 1.0, avg: 3.7, max: 9.0) [2024-08-05 17:42:23,119][15372] Avg episode reward: [(0, '39.845')] [2024-08-05 17:42:25,149][15444] Updated weights for policy 0, policy_version 32221 (0.0011) [2024-08-05 17:42:28,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 264019968. Throughput: 0: 6054.4. Samples: 66005520. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:42:28,126][15372] Avg episode reward: [(0, '40.537')] [2024-08-05 17:42:28,656][15444] Updated weights for policy 0, policy_version 32231 (0.0015) [2024-08-05 17:42:31,894][15444] Updated weights for policy 0, policy_version 32241 (0.0016) [2024-08-05 17:42:33,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 264142848. Throughput: 0: 6051.8. Samples: 66041340. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:42:33,119][15372] Avg episode reward: [(0, '41.357')] [2024-08-05 17:42:35,580][15444] Updated weights for policy 0, policy_version 32251 (0.0040) [2024-08-05 17:42:36,607][15417] Signal inference workers to stop experience collection... (12000 times) [2024-08-05 17:42:36,621][15417] Signal inference workers to resume experience collection... (12000 times) [2024-08-05 17:42:36,657][15444] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-08-05 17:42:36,657][15444] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-08-05 17:42:38,121][15372] Fps is (10 sec: 24570.2, 60 sec: 24166.3, 300 sec: 24187.0). Total num frames: 264265728. Throughput: 0: 6055.9. Samples: 66059910. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:42:38,121][15372] Avg episode reward: [(0, '40.437')] [2024-08-05 17:42:38,710][15444] Updated weights for policy 0, policy_version 32261 (0.0036) [2024-08-05 17:42:42,127][15444] Updated weights for policy 0, policy_version 32271 (0.0013) [2024-08-05 17:42:43,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 264388608. Throughput: 0: 6055.4. Samples: 66096600. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:42:43,119][15372] Avg episode reward: [(0, '41.177')] [2024-08-05 17:42:45,413][15444] Updated weights for policy 0, policy_version 32281 (0.0013) [2024-08-05 17:42:48,119][15372] Fps is (10 sec: 23761.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 264503296. Throughput: 0: 6066.4. Samples: 66133200. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:42:48,119][15372] Avg episode reward: [(0, '40.807')] [2024-08-05 17:42:48,979][15444] Updated weights for policy 0, policy_version 32291 (0.0038) [2024-08-05 17:42:52,222][15444] Updated weights for policy 0, policy_version 32301 (0.0032) [2024-08-05 17:42:53,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 264617984. Throughput: 0: 6052.2. Samples: 66151400. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:42:53,127][15372] Avg episode reward: [(0, '40.716')] [2024-08-05 17:42:55,554][15444] Updated weights for policy 0, policy_version 32311 (0.0020) [2024-08-05 17:42:58,119][15372] Fps is (10 sec: 24576.3, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 264749056. Throughput: 0: 6062.6. Samples: 66187890. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:42:58,119][15372] Avg episode reward: [(0, '41.803')] [2024-08-05 17:42:58,935][15444] Updated weights for policy 0, policy_version 32321 (0.0021) [2024-08-05 17:43:02,427][15444] Updated weights for policy 0, policy_version 32331 (0.0028) [2024-08-05 17:43:03,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 264863744. Throughput: 0: 6042.0. Samples: 66223600. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:43:03,126][15372] Avg episode reward: [(0, '41.570')] [2024-08-05 17:43:05,767][15444] Updated weights for policy 0, policy_version 32341 (0.0018) [2024-08-05 17:43:08,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.3, 300 sec: 24187.2). 
Total num frames: 264986624. Throughput: 0: 6045.5. Samples: 66241860. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:43:08,127][15372] Avg episode reward: [(0, '42.171')] [2024-08-05 17:43:09,220][15444] Updated weights for policy 0, policy_version 32351 (0.0010) [2024-08-05 17:43:12,705][15444] Updated weights for policy 0, policy_version 32361 (0.0029) [2024-08-05 17:43:13,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24187.3). Total num frames: 265109504. Throughput: 0: 6057.1. Samples: 66278090. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:43:13,119][15372] Avg episode reward: [(0, '41.410')] [2024-08-05 17:43:16,036][15444] Updated weights for policy 0, policy_version 32371 (0.0016) [2024-08-05 17:43:18,119][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 265232384. Throughput: 0: 6064.5. Samples: 66314240. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:43:18,119][15372] Avg episode reward: [(0, '40.471')] [2024-08-05 17:43:19,294][15444] Updated weights for policy 0, policy_version 32381 (0.0024) [2024-08-05 17:43:22,829][15444] Updated weights for policy 0, policy_version 32391 (0.0013) [2024-08-05 17:43:23,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 265347072. Throughput: 0: 6053.2. Samples: 66332290. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:43:23,119][15372] Avg episode reward: [(0, '40.397')] [2024-08-05 17:43:26,356][15444] Updated weights for policy 0, policy_version 32401 (0.0011) [2024-08-05 17:43:28,119][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 265469952. Throughput: 0: 6028.7. Samples: 66367890. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 17:43:28,126][15372] Avg episode reward: [(0, '40.904')] [2024-08-05 17:43:29,348][15417] Signal inference workers to stop experience collection... 
(12050 times) [2024-08-05 17:43:29,349][15417] Signal inference workers to resume experience collection... (12050 times) [2024-08-05 17:43:29,378][15444] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-08-05 17:43:29,378][15444] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-08-05 17:43:29,428][15444] Updated weights for policy 0, policy_version 32411 (0.0019) [2024-08-05 17:43:33,117][15444] Updated weights for policy 0, policy_version 32421 (0.0021) [2024-08-05 17:43:33,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 265592832. Throughput: 0: 6025.8. Samples: 66404360. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:43:33,124][15372] Avg episode reward: [(0, '40.905')] [2024-08-05 17:43:36,251][15444] Updated weights for policy 0, policy_version 32431 (0.0015) [2024-08-05 17:43:38,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24167.4, 300 sec: 24187.2). Total num frames: 265715712. Throughput: 0: 6030.9. Samples: 66422790. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:43:38,126][15372] Avg episode reward: [(0, '41.190')] [2024-08-05 17:43:39,720][15444] Updated weights for policy 0, policy_version 32441 (0.0021) [2024-08-05 17:43:43,119][15372] Fps is (10 sec: 23757.7, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 265830400. Throughput: 0: 6031.3. Samples: 66459300. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:43:43,127][15372] Avg episode reward: [(0, '41.059')] [2024-08-05 17:43:43,162][15444] Updated weights for policy 0, policy_version 32451 (0.0022) [2024-08-05 17:43:46,281][15444] Updated weights for policy 0, policy_version 32461 (0.0015) [2024-08-05 17:43:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.1, 300 sec: 24187.2). Total num frames: 265961472. Throughput: 0: 6041.8. Samples: 66495480. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:43:48,126][15372] Avg episode reward: [(0, '40.443')] [2024-08-05 17:43:49,766][15444] Updated weights for policy 0, policy_version 32471 (0.0010) [2024-08-05 17:43:53,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 266076160. Throughput: 0: 6049.8. Samples: 66514100. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:43:53,126][15372] Avg episode reward: [(0, '40.451')] [2024-08-05 17:43:53,169][15444] Updated weights for policy 0, policy_version 32481 (0.0023) [2024-08-05 17:43:56,547][15444] Updated weights for policy 0, policy_version 32491 (0.0012) [2024-08-05 17:43:58,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 266199040. Throughput: 0: 6056.4. Samples: 66550630. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:43:58,127][15372] Avg episode reward: [(0, '40.992')] [2024-08-05 17:43:59,797][15444] Updated weights for policy 0, policy_version 32501 (0.0036) [2024-08-05 17:44:03,108][15444] Updated weights for policy 0, policy_version 32511 (0.0023) [2024-08-05 17:44:03,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24439.4, 300 sec: 24215.0). Total num frames: 266330112. Throughput: 0: 6057.8. Samples: 66586840. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:44:03,119][15372] Avg episode reward: [(0, '40.619')] [2024-08-05 17:44:06,706][15444] Updated weights for policy 0, policy_version 32521 (0.0021) [2024-08-05 17:44:08,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24303.1, 300 sec: 24187.3). Total num frames: 266444800. Throughput: 0: 6065.3. Samples: 66605230. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 17:44:08,119][15372] Avg episode reward: [(0, '40.924')] [2024-08-05 17:44:09,773][15444] Updated weights for policy 0, policy_version 32531 (0.0013) [2024-08-05 17:44:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24303.0, 300 sec: 24215.1). 
Total num frames: 266567680. Throughput: 0: 6096.2. Samples: 66642220. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:44:13,126][15372] Avg episode reward: [(0, '41.527')] [2024-08-05 17:44:13,417][15444] Updated weights for policy 0, policy_version 32541 (0.0016) [2024-08-05 17:44:16,661][15444] Updated weights for policy 0, policy_version 32551 (0.0012) [2024-08-05 17:44:18,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 266690560. Throughput: 0: 6084.3. Samples: 66678150. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:44:18,119][15372] Avg episode reward: [(0, '41.553')] [2024-08-05 17:44:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000032555_266690560.pth... [2024-08-05 17:44:18,242][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000031845_260874240.pth [2024-08-05 17:44:20,090][15444] Updated weights for policy 0, policy_version 32561 (0.0035) [2024-08-05 17:44:23,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24302.8, 300 sec: 24187.3). Total num frames: 266805248. Throughput: 0: 6074.0. Samples: 66696120. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:44:23,127][15372] Avg episode reward: [(0, '41.421')] [2024-08-05 17:44:23,519][15444] Updated weights for policy 0, policy_version 32571 (0.0016) [2024-08-05 17:44:24,370][15417] Signal inference workers to stop experience collection... (12100 times) [2024-08-05 17:44:24,371][15417] Signal inference workers to resume experience collection... 
(12100 times) [2024-08-05 17:44:24,400][15444] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-08-05 17:44:24,401][15444] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-08-05 17:44:27,025][15444] Updated weights for policy 0, policy_version 32581 (0.0026) [2024-08-05 17:44:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 266928128. Throughput: 0: 6079.8. Samples: 66732890. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 17:44:28,126][15372] Avg episode reward: [(0, '40.980')] [2024-08-05 17:44:30,237][15444] Updated weights for policy 0, policy_version 32591 (0.0010) [2024-08-05 17:44:33,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24303.1, 300 sec: 24187.2). Total num frames: 267051008. Throughput: 0: 6084.9. Samples: 66769300. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:44:33,119][15372] Avg episode reward: [(0, '40.484')] [2024-08-05 17:44:33,750][15444] Updated weights for policy 0, policy_version 32601 (0.0019) [2024-08-05 17:44:36,936][15444] Updated weights for policy 0, policy_version 32611 (0.0025) [2024-08-05 17:44:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 267173888. Throughput: 0: 6085.3. Samples: 66787940. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:44:38,119][15372] Avg episode reward: [(0, '41.041')] [2024-08-05 17:44:40,300][15444] Updated weights for policy 0, policy_version 32621 (0.0011) [2024-08-05 17:44:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 267296768. Throughput: 0: 6073.8. Samples: 66823950. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:44:43,119][15372] Avg episode reward: [(0, '42.259')] [2024-08-05 17:44:43,746][15444] Updated weights for policy 0, policy_version 32631 (0.0017) [2024-08-05 17:44:47,314][15444] Updated weights for policy 0, policy_version 32641 (0.0016) [2024-08-05 17:44:48,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 267411456. Throughput: 0: 6056.4. Samples: 66859380. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:44:48,119][15372] Avg episode reward: [(0, '41.391')] [2024-08-05 17:44:50,644][15444] Updated weights for policy 0, policy_version 32651 (0.0018) [2024-08-05 17:44:53,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 267534336. Throughput: 0: 6065.1. Samples: 66878160. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 17:44:53,126][15372] Avg episode reward: [(0, '40.647')] [2024-08-05 17:44:53,894][15444] Updated weights for policy 0, policy_version 32661 (0.0021) [2024-08-05 17:44:57,724][15444] Updated weights for policy 0, policy_version 32671 (0.0031) [2024-08-05 17:44:58,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 267649024. Throughput: 0: 6027.8. Samples: 66913470. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 17:44:58,119][15372] Avg episode reward: [(0, '40.848')] [2024-08-05 17:45:00,584][15444] Updated weights for policy 0, policy_version 32681 (0.0027) [2024-08-05 17:45:03,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 267771904. Throughput: 0: 6023.1. Samples: 66949190. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 17:45:03,126][15372] Avg episode reward: [(0, '40.968')] [2024-08-05 17:45:04,315][15444] Updated weights for policy 0, policy_version 32691 (0.0010) [2024-08-05 17:45:07,797][15444] Updated weights for policy 0, policy_version 32701 (0.0014) [2024-08-05 17:45:08,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 267894784. Throughput: 0: 6032.7. Samples: 66967590. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 17:45:08,119][15372] Avg episode reward: [(0, '41.901')] [2024-08-05 17:45:10,986][15444] Updated weights for policy 0, policy_version 32711 (0.0023) [2024-08-05 17:45:13,123][15372] Fps is (10 sec: 24565.6, 60 sec: 24164.7, 300 sec: 24214.6). Total num frames: 268017664. Throughput: 0: 6009.2. Samples: 67003330. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 17:45:13,123][15372] Avg episode reward: [(0, '41.558')] [2024-08-05 17:45:14,662][15444] Updated weights for policy 0, policy_version 32721 (0.0022) [2024-08-05 17:45:17,930][15444] Updated weights for policy 0, policy_version 32731 (0.0011) [2024-08-05 17:45:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24187.4). Total num frames: 268132352. Throughput: 0: 5994.4. Samples: 67039050. Policy #0 lag: (min: 0.0, avg: 3.6, max: 9.0) [2024-08-05 17:45:18,119][15372] Avg episode reward: [(0, '40.649')] [2024-08-05 17:45:19,431][15417] Signal inference workers to stop experience collection... (12150 times) [2024-08-05 17:45:19,431][15417] Signal inference workers to resume experience collection... 
(12150 times) [2024-08-05 17:45:19,482][15444] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-08-05 17:45:19,482][15444] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-08-05 17:45:21,288][15444] Updated weights for policy 0, policy_version 32741 (0.0022) [2024-08-05 17:45:23,119][15372] Fps is (10 sec: 23766.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 268255232. Throughput: 0: 5998.4. Samples: 67057870. Policy #0 lag: (min: 0.0, avg: 3.6, max: 9.0) [2024-08-05 17:45:23,119][15372] Avg episode reward: [(0, '40.927')] [2024-08-05 17:45:24,752][15444] Updated weights for policy 0, policy_version 32751 (0.0023) [2024-08-05 17:45:27,862][15444] Updated weights for policy 0, policy_version 32761 (0.0025) [2024-08-05 17:45:28,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24302.9, 300 sec: 24270.6). Total num frames: 268386304. Throughput: 0: 6019.3. Samples: 67094820. Policy #0 lag: (min: 0.0, avg: 3.6, max: 9.0) [2024-08-05 17:45:28,119][15372] Avg episode reward: [(0, '41.778')] [2024-08-05 17:45:31,605][15444] Updated weights for policy 0, policy_version 32771 (0.0012) [2024-08-05 17:45:33,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 268492800. Throughput: 0: 6035.8. Samples: 67130990. Policy #0 lag: (min: 0.0, avg: 3.6, max: 9.0) [2024-08-05 17:45:33,119][15372] Avg episode reward: [(0, '41.794')] [2024-08-05 17:45:34,776][15444] Updated weights for policy 0, policy_version 32781 (0.0022) [2024-08-05 17:45:38,119][15372] Fps is (10 sec: 22937.1, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 268615680. Throughput: 0: 6034.2. Samples: 67149700. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:45:38,127][15372] Avg episode reward: [(0, '41.127')] [2024-08-05 17:45:38,130][15444] Updated weights for policy 0, policy_version 32791 (0.0020) [2024-08-05 17:45:41,391][15444] Updated weights for policy 0, policy_version 32801 (0.0021) [2024-08-05 17:45:43,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 268746752. Throughput: 0: 6052.7. Samples: 67185840. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:45:43,126][15372] Avg episode reward: [(0, '41.323')] [2024-08-05 17:45:44,767][15444] Updated weights for policy 0, policy_version 32811 (0.0010) [2024-08-05 17:45:48,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 268861440. Throughput: 0: 6061.8. Samples: 67221970. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:45:48,126][15372] Avg episode reward: [(0, '40.756')] [2024-08-05 17:45:48,424][15444] Updated weights for policy 0, policy_version 32821 (0.0016) [2024-08-05 17:45:51,550][15444] Updated weights for policy 0, policy_version 32831 (0.0018) [2024-08-05 17:45:53,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 268984320. Throughput: 0: 6060.7. Samples: 67240320. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 17:45:53,126][15372] Avg episode reward: [(0, '40.353')] [2024-08-05 17:45:54,930][15444] Updated weights for policy 0, policy_version 32841 (0.0011) [2024-08-05 17:45:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 269107200. Throughput: 0: 6087.7. Samples: 67277250. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:45:58,126][15372] Avg episode reward: [(0, '40.672')] [2024-08-05 17:45:58,502][15444] Updated weights for policy 0, policy_version 32851 (0.0011) [2024-08-05 17:46:01,621][15444] Updated weights for policy 0, policy_version 32861 (0.0032) [2024-08-05 17:46:03,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 269230080. Throughput: 0: 6085.1. Samples: 67312880. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:46:03,126][15372] Avg episode reward: [(0, '40.768')] [2024-08-05 17:46:05,002][15444] Updated weights for policy 0, policy_version 32871 (0.0034) [2024-08-05 17:46:08,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 269352960. Throughput: 0: 6082.5. Samples: 67331580. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:46:08,126][15372] Avg episode reward: [(0, '40.990')] [2024-08-05 17:46:08,694][15444] Updated weights for policy 0, policy_version 32881 (0.0014) [2024-08-05 17:46:09,026][15417] Signal inference workers to stop experience collection... (12200 times) [2024-08-05 17:46:09,026][15417] Signal inference workers to resume experience collection... (12200 times) [2024-08-05 17:46:09,093][15444] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-08-05 17:46:09,093][15444] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-08-05 17:46:11,681][15444] Updated weights for policy 0, policy_version 32891 (0.0014) [2024-08-05 17:46:13,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24304.7, 300 sec: 24242.8). Total num frames: 269475840. Throughput: 0: 6070.4. Samples: 67367990. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:46:13,119][15372] Avg episode reward: [(0, '41.038')] [2024-08-05 17:46:15,078][15444] Updated weights for policy 0, policy_version 32901 (0.0013) [2024-08-05 17:46:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24243.2). Total num frames: 269598720. Throughput: 0: 6082.7. Samples: 67404710. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 17:46:18,126][15372] Avg episode reward: [(0, '40.937')] [2024-08-05 17:46:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000032910_269598720.pth... [2024-08-05 17:46:18,247][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000032200_263782400.pth [2024-08-05 17:46:18,756][15444] Updated weights for policy 0, policy_version 32911 (0.0030) [2024-08-05 17:46:21,709][15444] Updated weights for policy 0, policy_version 32921 (0.0020) [2024-08-05 17:46:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 269713408. Throughput: 0: 6071.8. Samples: 67422930. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 17:46:23,119][15372] Avg episode reward: [(0, '40.539')] [2024-08-05 17:46:25,446][15444] Updated weights for policy 0, policy_version 32931 (0.0012) [2024-08-05 17:46:28,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 269844480. Throughput: 0: 6071.8. Samples: 67459070. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 17:46:28,119][15372] Avg episode reward: [(0, '40.991')] [2024-08-05 17:46:28,830][15444] Updated weights for policy 0, policy_version 32941 (0.0012) [2024-08-05 17:46:32,077][15444] Updated weights for policy 0, policy_version 32951 (0.0014) [2024-08-05 17:46:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24187.4). Total num frames: 269950976. 
Throughput: 0: 6057.3. Samples: 67494550. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 17:46:33,119][15372] Avg episode reward: [(0, '41.748')] [2024-08-05 17:46:35,705][15444] Updated weights for policy 0, policy_version 32961 (0.0013) [2024-08-05 17:46:38,119][15372] Fps is (10 sec: 22937.5, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 270073856. Throughput: 0: 6076.9. Samples: 67513780. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 17:46:38,119][15372] Avg episode reward: [(0, '40.858')] [2024-08-05 17:46:38,593][15444] Updated weights for policy 0, policy_version 32971 (0.0012) [2024-08-05 17:46:42,327][15444] Updated weights for policy 0, policy_version 32981 (0.0014) [2024-08-05 17:46:43,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24166.2, 300 sec: 24215.0). Total num frames: 270196736. Throughput: 0: 6048.6. Samples: 67549440. Policy #0 lag: (min: 0.0, avg: 3.7, max: 9.0) [2024-08-05 17:46:43,119][15372] Avg episode reward: [(0, '41.341')] [2024-08-05 17:46:44,514][15417] Signal inference workers to stop experience collection... (12250 times) [2024-08-05 17:46:44,514][15417] Signal inference workers to resume experience collection... (12250 times) [2024-08-05 17:46:44,563][15444] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-08-05 17:46:44,568][15444] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-08-05 17:46:45,860][15444] Updated weights for policy 0, policy_version 32991 (0.0019) [2024-08-05 17:46:48,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 270319616. Throughput: 0: 6076.9. Samples: 67586340. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 9.0) [2024-08-05 17:46:48,119][15372] Avg episode reward: [(0, '40.323')] [2024-08-05 17:46:48,897][15444] Updated weights for policy 0, policy_version 33001 (0.0014) [2024-08-05 17:46:52,383][15444] Updated weights for policy 0, policy_version 33011 (0.0011) [2024-08-05 17:46:53,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.7, 300 sec: 24242.7). Total num frames: 270442496. Throughput: 0: 6079.5. Samples: 67605160. Policy #0 lag: (min: 0.0, avg: 3.7, max: 9.0) [2024-08-05 17:46:53,119][15372] Avg episode reward: [(0, '41.006')] [2024-08-05 17:46:55,421][15444] Updated weights for policy 0, policy_version 33021 (0.0010) [2024-08-05 17:46:58,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 270565376. Throughput: 0: 6070.0. Samples: 67641140. Policy #0 lag: (min: 0.0, avg: 3.7, max: 9.0) [2024-08-05 17:46:58,126][15372] Avg episode reward: [(0, '41.097')] [2024-08-05 17:46:59,204][15444] Updated weights for policy 0, policy_version 33031 (0.0012) [2024-08-05 17:47:02,957][15444] Updated weights for policy 0, policy_version 33041 (0.0016) [2024-08-05 17:47:03,119][15372] Fps is (10 sec: 22937.8, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 270671872. Throughput: 0: 6046.6. Samples: 67676810. Policy #0 lag: (min: 0.0, avg: 2.9, max: 7.0) [2024-08-05 17:47:03,120][15372] Avg episode reward: [(0, '40.586')] [2024-08-05 17:47:05,856][15444] Updated weights for policy 0, policy_version 33051 (0.0019) [2024-08-05 17:47:08,119][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 270802944. Throughput: 0: 6038.4. Samples: 67694660. 
Policy #0 lag: (min: 0.0, avg: 2.9, max: 7.0) [2024-08-05 17:47:08,126][15372] Avg episode reward: [(0, '40.825')] [2024-08-05 17:47:09,454][15444] Updated weights for policy 0, policy_version 33061 (0.0014) [2024-08-05 17:47:12,845][15444] Updated weights for policy 0, policy_version 33071 (0.0020) [2024-08-05 17:47:13,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 270917632. Throughput: 0: 6038.0. Samples: 67730780. Policy #0 lag: (min: 0.0, avg: 2.9, max: 7.0) [2024-08-05 17:47:13,119][15372] Avg episode reward: [(0, '40.671')] [2024-08-05 17:47:16,228][15444] Updated weights for policy 0, policy_version 33081 (0.0020) [2024-08-05 17:47:18,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.7, 300 sec: 24187.2). Total num frames: 271040512. Throughput: 0: 6058.4. Samples: 67767180. Policy #0 lag: (min: 0.0, avg: 2.9, max: 7.0) [2024-08-05 17:47:18,127][15372] Avg episode reward: [(0, '40.797')] [2024-08-05 17:47:19,641][15444] Updated weights for policy 0, policy_version 33091 (0.0011) [2024-08-05 17:47:22,750][15444] Updated weights for policy 0, policy_version 33101 (0.0026) [2024-08-05 17:47:23,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 271163392. Throughput: 0: 6033.3. Samples: 67785280. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:47:23,119][15372] Avg episode reward: [(0, '40.968')] [2024-08-05 17:47:24,774][15417] Signal inference workers to stop experience collection... (12300 times) [2024-08-05 17:47:24,774][15417] Signal inference workers to resume experience collection... 
(12300 times) [2024-08-05 17:47:24,805][15444] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-08-05 17:47:24,806][15444] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-08-05 17:47:26,292][15444] Updated weights for policy 0, policy_version 33111 (0.0017) [2024-08-05 17:47:28,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 271286272. Throughput: 0: 6057.2. Samples: 67822010. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:47:28,119][15372] Avg episode reward: [(0, '40.850')] [2024-08-05 17:47:29,446][15444] Updated weights for policy 0, policy_version 33121 (0.0020) [2024-08-05 17:47:33,014][15444] Updated weights for policy 0, policy_version 33131 (0.0024) [2024-08-05 17:47:33,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24215.2). Total num frames: 271409152. Throughput: 0: 6053.8. Samples: 67858760. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:47:33,119][15372] Avg episode reward: [(0, '40.822')] [2024-08-05 17:47:36,375][15444] Updated weights for policy 0, policy_version 33141 (0.0013) [2024-08-05 17:47:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 271532032. Throughput: 0: 6038.5. Samples: 67876890. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:47:38,126][15372] Avg episode reward: [(0, '40.831')] [2024-08-05 17:47:39,676][15444] Updated weights for policy 0, policy_version 33151 (0.0016) [2024-08-05 17:47:43,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 271646720. Throughput: 0: 6058.0. Samples: 67913750. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:47:43,126][15372] Avg episode reward: [(0, '40.971')] [2024-08-05 17:47:43,197][15444] Updated weights for policy 0, policy_version 33161 (0.0011) [2024-08-05 17:47:46,282][15444] Updated weights for policy 0, policy_version 33171 (0.0024) [2024-08-05 17:47:48,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 271777792. Throughput: 0: 6063.6. Samples: 67949670. Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 17:47:48,126][15372] Avg episode reward: [(0, '40.782')] [2024-08-05 17:47:49,877][15444] Updated weights for policy 0, policy_version 33181 (0.0022) [2024-08-05 17:47:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 271892480. Throughput: 0: 6082.5. Samples: 67968370. Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 17:47:53,126][15372] Avg episode reward: [(0, '40.033')] [2024-08-05 17:47:53,273][15444] Updated weights for policy 0, policy_version 33191 (0.0012) [2024-08-05 17:47:56,429][15444] Updated weights for policy 0, policy_version 33201 (0.0017) [2024-08-05 17:47:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24242.8). Total num frames: 272015360. Throughput: 0: 6068.5. Samples: 68003860. Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 17:47:58,126][15372] Avg episode reward: [(0, '40.910')] [2024-08-05 17:48:00,099][15444] Updated weights for policy 0, policy_version 33211 (0.0020) [2024-08-05 17:48:03,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24439.6, 300 sec: 24242.8). Total num frames: 272138240. Throughput: 0: 6073.8. Samples: 68040500. 
Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 17:48:03,126][15372] Avg episode reward: [(0, '40.773')] [2024-08-05 17:48:03,436][15444] Updated weights for policy 0, policy_version 33221 (0.0013) [2024-08-05 17:48:06,925][15444] Updated weights for policy 0, policy_version 33231 (0.0013) [2024-08-05 17:48:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 272261120. Throughput: 0: 6076.9. Samples: 68058740. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 17:48:08,119][15372] Avg episode reward: [(0, '40.013')] [2024-08-05 17:48:10,265][15444] Updated weights for policy 0, policy_version 33241 (0.0022) [2024-08-05 17:48:13,119][15372] Fps is (10 sec: 22937.0, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 272367616. Throughput: 0: 6047.1. Samples: 68094130. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 17:48:13,128][15372] Avg episode reward: [(0, '41.016')] [2024-08-05 17:48:13,775][15444] Updated weights for policy 0, policy_version 33251 (0.0014) [2024-08-05 17:48:13,916][15417] Signal inference workers to stop experience collection... (12350 times) [2024-08-05 17:48:13,916][15417] Signal inference workers to resume experience collection... (12350 times) [2024-08-05 17:48:13,969][15444] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-08-05 17:48:13,969][15444] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-08-05 17:48:16,956][15444] Updated weights for policy 0, policy_version 33261 (0.0029) [2024-08-05 17:48:18,119][15372] Fps is (10 sec: 22937.4, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 272490496. Throughput: 0: 6033.6. Samples: 68130270. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 17:48:18,126][15372] Avg episode reward: [(0, '39.988')] [2024-08-05 17:48:18,186][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000033264_272498688.pth... [2024-08-05 17:48:18,329][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000032555_266690560.pth [2024-08-05 17:48:20,447][15444] Updated weights for policy 0, policy_version 33271 (0.0012) [2024-08-05 17:48:23,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 272613376. Throughput: 0: 6015.1. Samples: 68147570. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 17:48:23,119][15372] Avg episode reward: [(0, '41.567')] [2024-08-05 17:48:24,106][15444] Updated weights for policy 0, policy_version 33281 (0.0016) [2024-08-05 17:48:27,361][15444] Updated weights for policy 0, policy_version 33291 (0.0016) [2024-08-05 17:48:28,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24187.3). Total num frames: 272728064. Throughput: 0: 6003.1. Samples: 68183890. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 17:48:28,119][15372] Avg episode reward: [(0, '42.033')] [2024-08-05 17:48:30,769][15444] Updated weights for policy 0, policy_version 33301 (0.0014) [2024-08-05 17:48:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 272859136. Throughput: 0: 6006.7. Samples: 68219970. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 17:48:33,126][15372] Avg episode reward: [(0, '40.562')] [2024-08-05 17:48:34,461][15444] Updated weights for policy 0, policy_version 33311 (0.0022) [2024-08-05 17:48:37,495][15444] Updated weights for policy 0, policy_version 33321 (0.0020) [2024-08-05 17:48:38,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.8, 300 sec: 24215.0). Total num frames: 272973824. 
Throughput: 0: 5985.8. Samples: 68237730. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 17:48:38,126][15372] Avg episode reward: [(0, '39.954')] [2024-08-05 17:48:41,092][15444] Updated weights for policy 0, policy_version 33331 (0.0012) [2024-08-05 17:48:43,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 273096704. Throughput: 0: 5995.8. Samples: 68273670. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 17:48:43,119][15372] Avg episode reward: [(0, '39.996')] [2024-08-05 17:48:44,463][15444] Updated weights for policy 0, policy_version 33341 (0.0017) [2024-08-05 17:48:47,826][15444] Updated weights for policy 0, policy_version 33351 (0.0027) [2024-08-05 17:48:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 273219584. Throughput: 0: 5988.9. Samples: 68310000. Policy #0 lag: (min: 2.0, avg: 4.6, max: 8.0) [2024-08-05 17:48:48,119][15372] Avg episode reward: [(0, '41.713')] [2024-08-05 17:48:51,249][15444] Updated weights for policy 0, policy_version 33361 (0.0012) [2024-08-05 17:48:53,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 273334272. Throughput: 0: 6000.4. Samples: 68328760. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 17:48:53,126][15372] Avg episode reward: [(0, '41.236')] [2024-08-05 17:48:54,418][15444] Updated weights for policy 0, policy_version 33371 (0.0012) [2024-08-05 17:48:57,922][15444] Updated weights for policy 0, policy_version 33381 (0.0013) [2024-08-05 17:48:58,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 273457152. Throughput: 0: 6024.9. Samples: 68365250. 
Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 17:48:58,119][15372] Avg episode reward: [(0, '40.672')] [2024-08-05 17:49:01,186][15444] Updated weights for policy 0, policy_version 33391 (0.0014) [2024-08-05 17:49:02,048][15417] Signal inference workers to stop experience collection... (12400 times) [2024-08-05 17:49:02,057][15417] Signal inference workers to resume experience collection... (12400 times) [2024-08-05 17:49:02,098][15444] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-08-05 17:49:02,099][15444] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-08-05 17:49:03,119][15372] Fps is (10 sec: 24574.4, 60 sec: 24029.7, 300 sec: 24187.2). Total num frames: 273580032. Throughput: 0: 6010.8. Samples: 68400760. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 17:49:03,119][15372] Avg episode reward: [(0, '40.716')] [2024-08-05 17:49:04,654][15444] Updated weights for policy 0, policy_version 33401 (0.0010) [2024-08-05 17:49:08,118][15372] Fps is (10 sec: 23757.4, 60 sec: 23893.3, 300 sec: 24159.5). Total num frames: 273694720. Throughput: 0: 6043.8. Samples: 68419540. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 17:49:08,126][15372] Avg episode reward: [(0, '41.619')] [2024-08-05 17:49:08,186][15444] Updated weights for policy 0, policy_version 33411 (0.0015) [2024-08-05 17:49:11,190][15444] Updated weights for policy 0, policy_version 33421 (0.0012) [2024-08-05 17:49:13,118][15372] Fps is (10 sec: 24577.5, 60 sec: 24303.1, 300 sec: 24187.2). Total num frames: 273825792. Throughput: 0: 6050.0. Samples: 68456140. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 17:49:13,126][15372] Avg episode reward: [(0, '41.182')] [2024-08-05 17:49:14,774][15444] Updated weights for policy 0, policy_version 33431 (0.0020) [2024-08-05 17:49:18,065][15444] Updated weights for policy 0, policy_version 33441 (0.0025) [2024-08-05 17:49:18,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 273948672. Throughput: 0: 6059.5. Samples: 68492650. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 17:49:18,119][15372] Avg episode reward: [(0, '40.864')] [2024-08-05 17:49:21,282][15444] Updated weights for policy 0, policy_version 33451 (0.0018) [2024-08-05 17:49:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 274063360. Throughput: 0: 6087.8. Samples: 68511680. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 17:49:23,126][15372] Avg episode reward: [(0, '41.392')] [2024-08-05 17:49:24,819][15444] Updated weights for policy 0, policy_version 33461 (0.0017) [2024-08-05 17:49:28,110][15444] Updated weights for policy 0, policy_version 33471 (0.0018) [2024-08-05 17:49:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.4, 300 sec: 24215.0). Total num frames: 274194432. Throughput: 0: 6094.0. Samples: 68547900. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 17:49:28,119][15372] Avg episode reward: [(0, '42.651')] [2024-08-05 17:49:28,122][15417] Saving new best policy, reward=42.651! [2024-08-05 17:49:31,528][15444] Updated weights for policy 0, policy_version 33481 (0.0029) [2024-08-05 17:49:33,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 274309120. Throughput: 0: 6074.9. Samples: 68583370. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:49:33,126][15372] Avg episode reward: [(0, '42.467')] [2024-08-05 17:49:35,022][15444] Updated weights for policy 0, policy_version 33491 (0.0013) [2024-08-05 17:49:38,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 274432000. Throughput: 0: 6072.4. Samples: 68602020. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:49:38,126][15372] Avg episode reward: [(0, '40.443')] [2024-08-05 17:49:38,308][15444] Updated weights for policy 0, policy_version 33501 (0.0020) [2024-08-05 17:49:41,690][15444] Updated weights for policy 0, policy_version 33511 (0.0023) [2024-08-05 17:49:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 274554880. Throughput: 0: 6067.1. Samples: 68638270. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:49:43,119][15372] Avg episode reward: [(0, '39.587')] [2024-08-05 17:49:45,046][15444] Updated weights for policy 0, policy_version 33521 (0.0020) [2024-08-05 17:49:48,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 274677760. Throughput: 0: 6104.8. Samples: 68675470. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:49:48,126][15372] Avg episode reward: [(0, '40.938')] [2024-08-05 17:49:48,213][15444] Updated weights for policy 0, policy_version 33531 (0.0020) [2024-08-05 17:49:51,603][15444] Updated weights for policy 0, policy_version 33541 (0.0019) [2024-08-05 17:49:53,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 274800640. Throughput: 0: 6096.0. Samples: 68693860. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:49:53,126][15372] Avg episode reward: [(0, '41.978')] [2024-08-05 17:49:54,980][15444] Updated weights for policy 0, policy_version 33551 (0.0024) [2024-08-05 17:49:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.6, 300 sec: 24242.8). 
Total num frames: 274923520. Throughput: 0: 6111.8. Samples: 68731170. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:49:58,126][15372] Avg episode reward: [(0, '43.060')] [2024-08-05 17:49:58,129][15417] Saving new best policy, reward=43.060! [2024-08-05 17:49:58,557][15444] Updated weights for policy 0, policy_version 33561 (0.0017) [2024-08-05 17:50:01,636][15444] Updated weights for policy 0, policy_version 33571 (0.0012) [2024-08-05 17:50:03,043][15417] Signal inference workers to stop experience collection... (12450 times) [2024-08-05 17:50:03,053][15417] Signal inference workers to resume experience collection... (12450 times) [2024-08-05 17:50:03,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24303.1, 300 sec: 24215.0). Total num frames: 275038208. Throughput: 0: 6089.5. Samples: 68766680. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:50:03,119][15372] Avg episode reward: [(0, '42.186')] [2024-08-05 17:50:03,124][15444] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-08-05 17:50:03,124][15444] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-08-05 17:50:05,066][15444] Updated weights for policy 0, policy_version 33581 (0.0011) [2024-08-05 17:50:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24439.4, 300 sec: 24215.3). Total num frames: 275161088. Throughput: 0: 6070.4. Samples: 68784850. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:50:08,126][15372] Avg episode reward: [(0, '41.735')] [2024-08-05 17:50:08,625][15444] Updated weights for policy 0, policy_version 33591 (0.0020) [2024-08-05 17:50:11,842][15444] Updated weights for policy 0, policy_version 33601 (0.0011) [2024-08-05 17:50:13,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 275283968. Throughput: 0: 6074.0. Samples: 68821230. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:50:13,119][15372] Avg episode reward: [(0, '42.398')] [2024-08-05 17:50:15,261][15444] Updated weights for policy 0, policy_version 33611 (0.0031) [2024-08-05 17:50:18,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 275415040. Throughput: 0: 6108.4. Samples: 68858250. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:50:18,126][15372] Avg episode reward: [(0, '41.958')] [2024-08-05 17:50:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000033620_275415040.pth... [2024-08-05 17:50:18,237][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000032910_269598720.pth [2024-08-05 17:50:18,780][15444] Updated weights for policy 0, policy_version 33621 (0.0026) [2024-08-05 17:50:22,180][15444] Updated weights for policy 0, policy_version 33631 (0.0012) [2024-08-05 17:50:23,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 275529728. Throughput: 0: 6100.0. Samples: 68876520. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:50:23,119][15372] Avg episode reward: [(0, '41.061')] [2024-08-05 17:50:25,343][15444] Updated weights for policy 0, policy_version 33641 (0.0021) [2024-08-05 17:50:28,124][15372] Fps is (10 sec: 23744.6, 60 sec: 24300.9, 300 sec: 24270.1). Total num frames: 275652608. Throughput: 0: 6102.6. Samples: 68912920. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:50:28,124][15372] Avg episode reward: [(0, '40.891')] [2024-08-05 17:50:28,678][15444] Updated weights for policy 0, policy_version 33651 (0.0022) [2024-08-05 17:50:32,161][15444] Updated weights for policy 0, policy_version 33661 (0.0035) [2024-08-05 17:50:33,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24439.3, 300 sec: 24270.5). Total num frames: 275775488. 
Throughput: 0: 6081.7. Samples: 68949150. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 17:50:33,119][15372] Avg episode reward: [(0, '40.883')] [2024-08-05 17:50:35,371][15444] Updated weights for policy 0, policy_version 33671 (0.0029) [2024-08-05 17:50:38,119][15372] Fps is (10 sec: 24588.4, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 275898368. Throughput: 0: 6080.2. Samples: 68967470. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:50:38,119][15372] Avg episode reward: [(0, '41.149')] [2024-08-05 17:50:38,955][15444] Updated weights for policy 0, policy_version 33681 (0.0014) [2024-08-05 17:50:42,244][15444] Updated weights for policy 0, policy_version 33691 (0.0017) [2024-08-05 17:50:43,119][15372] Fps is (10 sec: 23757.4, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 276013056. Throughput: 0: 6046.9. Samples: 69003280. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:50:43,119][15372] Avg episode reward: [(0, '41.190')] [2024-08-05 17:50:45,562][15444] Updated weights for policy 0, policy_version 33701 (0.0012) [2024-08-05 17:50:48,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 276135936. Throughput: 0: 6076.9. Samples: 69040140. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:50:48,126][15372] Avg episode reward: [(0, '41.220')] [2024-08-05 17:50:49,330][15444] Updated weights for policy 0, policy_version 33711 (0.0025) [2024-08-05 17:50:52,320][15444] Updated weights for policy 0, policy_version 33721 (0.0012) [2024-08-05 17:50:53,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 276250624. Throughput: 0: 6071.3. Samples: 69058060. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:50:53,126][15372] Avg episode reward: [(0, '40.928')] [2024-08-05 17:50:54,224][15417] Signal inference workers to stop experience collection... 
(12500 times) [2024-08-05 17:50:54,224][15417] Signal inference workers to resume experience collection... (12500 times) [2024-08-05 17:50:54,273][15444] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-08-05 17:50:54,278][15444] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-08-05 17:50:55,843][15444] Updated weights for policy 0, policy_version 33731 (0.0016) [2024-08-05 17:50:58,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 276381696. Throughput: 0: 6072.2. Samples: 69094480. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 17:50:58,119][15372] Avg episode reward: [(0, '41.296')] [2024-08-05 17:50:59,456][15444] Updated weights for policy 0, policy_version 33741 (0.0014) [2024-08-05 17:51:02,409][15444] Updated weights for policy 0, policy_version 33751 (0.0011) [2024-08-05 17:51:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 276496384. Throughput: 0: 6038.0. Samples: 69129960. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 17:51:03,119][15372] Avg episode reward: [(0, '41.821')] [2024-08-05 17:51:06,150][15444] Updated weights for policy 0, policy_version 33761 (0.0012) [2024-08-05 17:51:08,118][15372] Fps is (10 sec: 22937.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 276611072. Throughput: 0: 6042.4. Samples: 69148430. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 17:51:08,119][15372] Avg episode reward: [(0, '41.118')] [2024-08-05 17:51:09,478][15444] Updated weights for policy 0, policy_version 33771 (0.0012) [2024-08-05 17:51:12,661][15444] Updated weights for policy 0, policy_version 33781 (0.0011) [2024-08-05 17:51:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 276742144. Throughput: 0: 6054.0. Samples: 69185320. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 17:51:13,119][15372] Avg episode reward: [(0, '40.972')] [2024-08-05 17:51:16,332][15444] Updated weights for policy 0, policy_version 33791 (0.0017) [2024-08-05 17:51:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 276856832. Throughput: 0: 6044.1. Samples: 69221130. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 17:51:18,119][15372] Avg episode reward: [(0, '42.051')] [2024-08-05 17:51:19,398][15444] Updated weights for policy 0, policy_version 33801 (0.0039) [2024-08-05 17:51:22,869][15444] Updated weights for policy 0, policy_version 33811 (0.0018) [2024-08-05 17:51:23,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 276979712. Throughput: 0: 6056.0. Samples: 69239990. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:51:23,119][15372] Avg episode reward: [(0, '40.595')] [2024-08-05 17:51:26,270][15444] Updated weights for policy 0, policy_version 33821 (0.0015) [2024-08-05 17:51:28,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24168.5, 300 sec: 24242.8). Total num frames: 277102592. Throughput: 0: 6064.5. Samples: 69276180. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:51:28,126][15372] Avg episode reward: [(0, '40.875')] [2024-08-05 17:51:29,410][15444] Updated weights for policy 0, policy_version 33831 (0.0011) [2024-08-05 17:51:32,997][15444] Updated weights for policy 0, policy_version 33841 (0.0012) [2024-08-05 17:51:33,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.4, 300 sec: 24242.7). Total num frames: 277225472. Throughput: 0: 6060.0. Samples: 69312840. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:51:33,119][15372] Avg episode reward: [(0, '40.626')] [2024-08-05 17:51:36,217][15444] Updated weights for policy 0, policy_version 33851 (0.0018) [2024-08-05 17:51:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24242.8). 
Total num frames: 277348352. Throughput: 0: 6071.3. Samples: 69331270. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 17:51:38,126][15372] Avg episode reward: [(0, '39.634')] [2024-08-05 17:51:39,802][15444] Updated weights for policy 0, policy_version 33861 (0.0012) [2024-08-05 17:51:43,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 277463040. Throughput: 0: 6078.0. Samples: 69367990. Policy #0 lag: (min: 2.0, avg: 4.5, max: 8.0) [2024-08-05 17:51:43,127][15372] Avg episode reward: [(0, '40.244')] [2024-08-05 17:51:43,215][15444] Updated weights for policy 0, policy_version 33871 (0.0013) [2024-08-05 17:51:43,781][15417] Signal inference workers to stop experience collection... (12550 times) [2024-08-05 17:51:43,781][15417] Signal inference workers to resume experience collection... (12550 times) [2024-08-05 17:51:43,822][15444] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-08-05 17:51:43,831][15444] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-08-05 17:51:46,409][15444] Updated weights for policy 0, policy_version 33881 (0.0017) [2024-08-05 17:51:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 277585920. Throughput: 0: 6073.1. Samples: 69403250. Policy #0 lag: (min: 2.0, avg: 4.5, max: 8.0) [2024-08-05 17:51:48,126][15372] Avg episode reward: [(0, '40.183')] [2024-08-05 17:51:49,971][15444] Updated weights for policy 0, policy_version 33891 (0.0013) [2024-08-05 17:51:53,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 277708800. Throughput: 0: 6070.0. Samples: 69421580. 
Policy #0 lag: (min: 2.0, avg: 4.5, max: 8.0) [2024-08-05 17:51:53,126][15372] Avg episode reward: [(0, '40.988')] [2024-08-05 17:51:53,158][15444] Updated weights for policy 0, policy_version 33901 (0.0019) [2024-08-05 17:51:56,635][15444] Updated weights for policy 0, policy_version 33911 (0.0013) [2024-08-05 17:51:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24242.8). Total num frames: 277823488. Throughput: 0: 6048.2. Samples: 69457490. Policy #0 lag: (min: 2.0, avg: 4.5, max: 8.0) [2024-08-05 17:51:58,126][15372] Avg episode reward: [(0, '41.945')] [2024-08-05 17:52:00,248][15444] Updated weights for policy 0, policy_version 33921 (0.0013) [2024-08-05 17:52:03,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 277954560. Throughput: 0: 6060.9. Samples: 69493870. Policy #0 lag: (min: 2.0, avg: 4.5, max: 8.0) [2024-08-05 17:52:03,126][15372] Avg episode reward: [(0, '41.310')] [2024-08-05 17:52:03,392][15444] Updated weights for policy 0, policy_version 33931 (0.0012) [2024-08-05 17:52:07,106][15444] Updated weights for policy 0, policy_version 33941 (0.0017) [2024-08-05 17:52:08,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 278069248. Throughput: 0: 6042.5. Samples: 69511900. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:52:08,119][15372] Avg episode reward: [(0, '41.344')] [2024-08-05 17:52:10,297][15444] Updated weights for policy 0, policy_version 33951 (0.0035) [2024-08-05 17:52:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 278192128. Throughput: 0: 6029.6. Samples: 69547510. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:52:13,126][15372] Avg episode reward: [(0, '41.335')] [2024-08-05 17:52:13,884][15444] Updated weights for policy 0, policy_version 33961 (0.0019) [2024-08-05 17:52:17,234][15444] Updated weights for policy 0, policy_version 33971 (0.0016) [2024-08-05 17:52:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 278306816. Throughput: 0: 6005.2. Samples: 69583070. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:52:18,119][15372] Avg episode reward: [(0, '41.644')] [2024-08-05 17:52:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000033973_278306816.pth... [2024-08-05 17:52:18,263][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000033264_272498688.pth [2024-08-05 17:52:20,588][15444] Updated weights for policy 0, policy_version 33981 (0.0026) [2024-08-05 17:52:22,442][15417] Signal inference workers to stop experience collection... (12600 times) [2024-08-05 17:52:22,446][15417] Signal inference workers to resume experience collection... (12600 times) [2024-08-05 17:52:22,487][15444] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-08-05 17:52:22,497][15444] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-08-05 17:52:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 278429696. Throughput: 0: 5996.4. Samples: 69601110. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 17:52:23,119][15372] Avg episode reward: [(0, '41.161')] [2024-08-05 17:52:24,130][15444] Updated weights for policy 0, policy_version 33991 (0.0031) [2024-08-05 17:52:27,306][15444] Updated weights for policy 0, policy_version 34001 (0.0019) [2024-08-05 17:52:28,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 278552576. Throughput: 0: 5993.8. Samples: 69637710. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:52:28,119][15372] Avg episode reward: [(0, '40.864')] [2024-08-05 17:52:30,820][15444] Updated weights for policy 0, policy_version 34011 (0.0026) [2024-08-05 17:52:33,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 278667264. Throughput: 0: 6035.3. Samples: 69674840. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:52:33,126][15372] Avg episode reward: [(0, '41.231')] [2024-08-05 17:52:34,129][15444] Updated weights for policy 0, policy_version 34021 (0.0011) [2024-08-05 17:52:37,591][15444] Updated weights for policy 0, policy_version 34031 (0.0012) [2024-08-05 17:52:38,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 278790144. Throughput: 0: 6013.1. Samples: 69692170. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:52:38,119][15372] Avg episode reward: [(0, '41.719')] [2024-08-05 17:52:40,957][15444] Updated weights for policy 0, policy_version 34041 (0.0013) [2024-08-05 17:52:43,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 278913024. Throughput: 0: 6014.2. Samples: 69728130. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 17:52:43,126][15372] Avg episode reward: [(0, '41.611')] [2024-08-05 17:52:44,484][15444] Updated weights for policy 0, policy_version 34051 (0.0021) [2024-08-05 17:52:48,105][15444] Updated weights for policy 0, policy_version 34061 (0.0017) [2024-08-05 17:52:48,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 279027712. Throughput: 0: 5995.1. Samples: 69763650. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:52:48,119][15372] Avg episode reward: [(0, '42.071')] [2024-08-05 17:52:51,402][15444] Updated weights for policy 0, policy_version 34071 (0.0012) [2024-08-05 17:52:53,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24029.7, 300 sec: 24187.2). Total num frames: 279150592. Throughput: 0: 5996.0. Samples: 69781720. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:52:53,119][15372] Avg episode reward: [(0, '40.520')] [2024-08-05 17:52:54,740][15444] Updated weights for policy 0, policy_version 34081 (0.0013) [2024-08-05 17:52:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 279265280. Throughput: 0: 6014.2. Samples: 69818150. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:52:58,126][15372] Avg episode reward: [(0, '40.335')] [2024-08-05 17:52:58,264][15444] Updated weights for policy 0, policy_version 34091 (0.0012) [2024-08-05 17:53:01,350][15444] Updated weights for policy 0, policy_version 34101 (0.0022) [2024-08-05 17:53:03,119][15372] Fps is (10 sec: 23757.1, 60 sec: 23893.3, 300 sec: 24159.4). Total num frames: 279388160. Throughput: 0: 6017.5. Samples: 69853860. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:53:03,126][15372] Avg episode reward: [(0, '40.894')] [2024-08-05 17:53:04,927][15444] Updated weights for policy 0, policy_version 34111 (0.0011) [2024-08-05 17:53:08,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24029.7, 300 sec: 24215.0). 
Total num frames: 279511040. Throughput: 0: 6043.9. Samples: 69873090. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:53:08,127][15372] Avg episode reward: [(0, '41.420')] [2024-08-05 17:53:08,167][15444] Updated weights for policy 0, policy_version 34121 (0.0017) [2024-08-05 17:53:10,031][15417] Signal inference workers to stop experience collection... (12650 times) [2024-08-05 17:53:10,031][15417] Signal inference workers to resume experience collection... (12650 times) [2024-08-05 17:53:10,062][15444] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-08-05 17:53:10,063][15444] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-08-05 17:53:11,632][15444] Updated weights for policy 0, policy_version 34131 (0.0020) [2024-08-05 17:53:13,121][15372] Fps is (10 sec: 24569.7, 60 sec: 24028.8, 300 sec: 24214.8). Total num frames: 279633920. Throughput: 0: 6036.8. Samples: 69909380. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:53:13,122][15372] Avg episode reward: [(0, '42.030')] [2024-08-05 17:53:14,866][15444] Updated weights for policy 0, policy_version 34141 (0.0018) [2024-08-05 17:53:18,118][15372] Fps is (10 sec: 24577.3, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 279756800. Throughput: 0: 6030.5. Samples: 69946210. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:53:18,126][15372] Avg episode reward: [(0, '42.488')] [2024-08-05 17:53:18,266][15444] Updated weights for policy 0, policy_version 34151 (0.0023) [2024-08-05 17:53:21,519][15444] Updated weights for policy 0, policy_version 34161 (0.0013) [2024-08-05 17:53:23,133][15372] Fps is (10 sec: 24546.8, 60 sec: 24160.5, 300 sec: 24241.6). Total num frames: 279879680. Throughput: 0: 6049.6. Samples: 69964490. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:53:23,141][15372] Avg episode reward: [(0, '41.518')] [2024-08-05 17:53:24,883][15444] Updated weights for policy 0, policy_version 34171 (0.0012) [2024-08-05 17:53:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 280002560. Throughput: 0: 6068.2. Samples: 70001200. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 17:53:28,126][15372] Avg episode reward: [(0, '40.427')] [2024-08-05 17:53:28,528][15444] Updated weights for policy 0, policy_version 34181 (0.0013) [2024-08-05 17:53:31,565][15444] Updated weights for policy 0, policy_version 34191 (0.0013) [2024-08-05 17:53:33,118][15372] Fps is (10 sec: 23791.8, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 280117248. Throughput: 0: 6072.5. Samples: 70036910. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 17:53:33,126][15372] Avg episode reward: [(0, '41.247')] [2024-08-05 17:53:35,248][15444] Updated weights for policy 0, policy_version 34201 (0.0015) [2024-08-05 17:53:38,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 280248320. Throughput: 0: 6076.0. Samples: 70055140. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 17:53:38,123][15372] Avg episode reward: [(0, '41.736')] [2024-08-05 17:53:38,474][15444] Updated weights for policy 0, policy_version 34211 (0.0020) [2024-08-05 17:53:41,858][15444] Updated weights for policy 0, policy_version 34221 (0.0012) [2024-08-05 17:53:43,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 280363008. Throughput: 0: 6074.2. Samples: 70091490. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 17:53:43,119][15372] Avg episode reward: [(0, '41.773')] [2024-08-05 17:53:45,284][15444] Updated weights for policy 0, policy_version 34231 (0.0037) [2024-08-05 17:53:48,119][15372] Fps is (10 sec: 23754.8, 60 sec: 24302.6, 300 sec: 24242.7). 
Total num frames: 280485888. Throughput: 0: 6097.2. Samples: 70128240. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 17:53:48,120][15372] Avg episode reward: [(0, '41.343')] [2024-08-05 17:53:48,562][15444] Updated weights for policy 0, policy_version 34241 (0.0013) [2024-08-05 17:53:50,016][15417] Signal inference workers to stop experience collection... (12700 times) [2024-08-05 17:53:50,017][15417] Signal inference workers to resume experience collection... (12700 times) [2024-08-05 17:53:50,062][15444] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-08-05 17:53:50,067][15444] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-08-05 17:53:52,096][15444] Updated weights for policy 0, policy_version 34251 (0.0014) [2024-08-05 17:53:53,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 280608768. Throughput: 0: 6055.6. Samples: 70145590. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 17:53:53,119][15372] Avg episode reward: [(0, '40.986')] [2024-08-05 17:53:55,225][15444] Updated weights for policy 0, policy_version 34261 (0.0025) [2024-08-05 17:53:58,120][15372] Fps is (10 sec: 23754.6, 60 sec: 24302.2, 300 sec: 24214.9). Total num frames: 280723456. Throughput: 0: 6058.6. Samples: 70182010. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:53:58,121][15372] Avg episode reward: [(0, '40.842')] [2024-08-05 17:53:58,975][15444] Updated weights for policy 0, policy_version 34271 (0.0021) [2024-08-05 17:54:02,438][15444] Updated weights for policy 0, policy_version 34281 (0.0016) [2024-08-05 17:54:03,118][15372] Fps is (10 sec: 22938.1, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 280838144. Throughput: 0: 6030.7. Samples: 70217590. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:54:03,119][15372] Avg episode reward: [(0, '40.478')] [2024-08-05 17:54:05,587][15444] Updated weights for policy 0, policy_version 34291 (0.0013) [2024-08-05 17:54:08,126][15372] Fps is (10 sec: 24561.7, 60 sec: 24300.0, 300 sec: 24214.4). Total num frames: 280969216. Throughput: 0: 6019.8. Samples: 70235340. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:54:08,134][15372] Avg episode reward: [(0, '41.081')] [2024-08-05 17:54:09,372][15444] Updated weights for policy 0, policy_version 34301 (0.0016) [2024-08-05 17:54:12,397][15444] Updated weights for policy 0, policy_version 34311 (0.0014) [2024-08-05 17:54:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24031.0, 300 sec: 24159.5). Total num frames: 281075712. Throughput: 0: 6007.8. Samples: 70271550. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 17:54:13,119][15372] Avg episode reward: [(0, '40.612')] [2024-08-05 17:54:16,116][15444] Updated weights for policy 0, policy_version 34321 (0.0015) [2024-08-05 17:54:18,119][15372] Fps is (10 sec: 23774.2, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 281206784. Throughput: 0: 6016.6. Samples: 70307660. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:54:18,119][15372] Avg episode reward: [(0, '41.374')] [2024-08-05 17:54:18,126][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000034327_281206784.pth... [2024-08-05 17:54:18,269][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000033620_275415040.pth [2024-08-05 17:54:19,650][15444] Updated weights for policy 0, policy_version 34331 (0.0010) [2024-08-05 17:54:22,719][15444] Updated weights for policy 0, policy_version 34341 (0.0012) [2024-08-05 17:54:23,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24172.3, 300 sec: 24187.2). Total num frames: 281329664. 
Throughput: 0: 6002.0. Samples: 70325230. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:54:23,119][15372] Avg episode reward: [(0, '41.772')] [2024-08-05 17:54:24,651][15417] Signal inference workers to stop experience collection... (12750 times) [2024-08-05 17:54:24,652][15417] Signal inference workers to resume experience collection... (12750 times) [2024-08-05 17:54:24,695][15444] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-08-05 17:54:24,701][15444] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-08-05 17:54:26,309][15444] Updated weights for policy 0, policy_version 34351 (0.0020) [2024-08-05 17:54:28,118][15372] Fps is (10 sec: 23757.5, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 281444352. Throughput: 0: 6011.4. Samples: 70362000. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:54:28,119][15372] Avg episode reward: [(0, '41.587')] [2024-08-05 17:54:29,262][15444] Updated weights for policy 0, policy_version 34361 (0.0028) [2024-08-05 17:54:32,829][15444] Updated weights for policy 0, policy_version 34371 (0.0026) [2024-08-05 17:54:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 281567232. Throughput: 0: 6007.7. Samples: 70398580. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:54:33,119][15372] Avg episode reward: [(0, '42.373')] [2024-08-05 17:54:36,649][15444] Updated weights for policy 0, policy_version 34381 (0.0018) [2024-08-05 17:54:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 281690112. Throughput: 0: 6019.6. Samples: 70416470. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:54:38,119][15372] Avg episode reward: [(0, '42.810')] [2024-08-05 17:54:39,567][15444] Updated weights for policy 0, policy_version 34391 (0.0010) [2024-08-05 17:54:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 281804800. 
Throughput: 0: 6012.7. Samples: 70452570. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:54:43,126][15372] Avg episode reward: [(0, '41.877')] [2024-08-05 17:54:43,170][15444] Updated weights for policy 0, policy_version 34401 (0.0021) [2024-08-05 17:54:46,375][15444] Updated weights for policy 0, policy_version 34411 (0.0015) [2024-08-05 17:54:48,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.8, 300 sec: 24187.2). Total num frames: 281935872. Throughput: 0: 6033.3. Samples: 70489090. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:54:48,126][15372] Avg episode reward: [(0, '41.539')] [2024-08-05 17:54:49,876][15444] Updated weights for policy 0, policy_version 34421 (0.0012) [2024-08-05 17:54:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 282050560. Throughput: 0: 6053.3. Samples: 70507690. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:54:53,126][15372] Avg episode reward: [(0, '41.330')] [2024-08-05 17:54:53,205][15444] Updated weights for policy 0, policy_version 34431 (0.0013) [2024-08-05 17:54:56,505][15444] Updated weights for policy 0, policy_version 34441 (0.0019) [2024-08-05 17:54:57,533][15417] Signal inference workers to stop experience collection... (12800 times) [2024-08-05 17:54:57,541][15417] Signal inference workers to resume experience collection... (12800 times) [2024-08-05 17:54:57,598][15444] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-08-05 17:54:57,604][15444] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-08-05 17:54:58,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24167.1, 300 sec: 24187.2). Total num frames: 282173440. Throughput: 0: 6045.7. Samples: 70543610. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 17:54:58,119][15372] Avg episode reward: [(0, '41.596')] [2024-08-05 17:55:00,070][15444] Updated weights for policy 0, policy_version 34451 (0.0023) [2024-08-05 17:55:03,119][15372] Fps is (10 sec: 24574.7, 60 sec: 24302.7, 300 sec: 24187.2). Total num frames: 282296320. Throughput: 0: 6053.5. Samples: 70580070. Policy #0 lag: (min: 2.0, avg: 4.7, max: 8.0) [2024-08-05 17:55:03,127][15372] Avg episode reward: [(0, '41.059')] [2024-08-05 17:55:03,308][15444] Updated weights for policy 0, policy_version 34461 (0.0022) [2024-08-05 17:55:06,813][15444] Updated weights for policy 0, policy_version 34471 (0.0017) [2024-08-05 17:55:08,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24169.5, 300 sec: 24187.2). Total num frames: 282419200. Throughput: 0: 6079.3. Samples: 70598800. Policy #0 lag: (min: 2.0, avg: 4.7, max: 8.0) [2024-08-05 17:55:08,119][15372] Avg episode reward: [(0, '41.696')] [2024-08-05 17:55:10,106][15444] Updated weights for policy 0, policy_version 34481 (0.0015) [2024-08-05 17:55:13,118][15372] Fps is (10 sec: 23758.1, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 282533888. Throughput: 0: 6073.1. Samples: 70635290. Policy #0 lag: (min: 2.0, avg: 4.7, max: 8.0) [2024-08-05 17:55:13,126][15372] Avg episode reward: [(0, '41.954')] [2024-08-05 17:55:13,580][15444] Updated weights for policy 0, policy_version 34491 (0.0011) [2024-08-05 17:55:16,780][15444] Updated weights for policy 0, policy_version 34501 (0.0024) [2024-08-05 17:55:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 282656768. Throughput: 0: 6048.9. Samples: 70670780. Policy #0 lag: (min: 2.0, avg: 4.7, max: 8.0) [2024-08-05 17:55:18,119][15372] Avg episode reward: [(0, '41.890')] [2024-08-05 17:55:20,235][15444] Updated weights for policy 0, policy_version 34511 (0.0022) [2024-08-05 17:55:23,119][15372] Fps is (10 sec: 25394.0, 60 sec: 24302.8, 300 sec: 24187.6). 
Total num frames: 282787840. Throughput: 0: 6064.2. Samples: 70689360. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 17:55:23,127][15372] Avg episode reward: [(0, '41.657')] [2024-08-05 17:55:23,752][15444] Updated weights for policy 0, policy_version 34521 (0.0017) [2024-08-05 17:55:27,055][15444] Updated weights for policy 0, policy_version 34531 (0.0014) [2024-08-05 17:55:28,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24166.2, 300 sec: 24131.7). Total num frames: 282894336. Throughput: 0: 6051.9. Samples: 70724910. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 17:55:28,120][15372] Avg episode reward: [(0, '41.836')] [2024-08-05 17:55:30,463][15444] Updated weights for policy 0, policy_version 34541 (0.0020) [2024-08-05 17:55:33,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 283025408. Throughput: 0: 6071.3. Samples: 70762300. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 17:55:33,119][15372] Avg episode reward: [(0, '41.336')] [2024-08-05 17:55:33,739][15444] Updated weights for policy 0, policy_version 34551 (0.0015) [2024-08-05 17:55:37,115][15444] Updated weights for policy 0, policy_version 34561 (0.0012) [2024-08-05 17:55:38,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 283140096. Throughput: 0: 6052.7. Samples: 70780060. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 17:55:38,119][15372] Avg episode reward: [(0, '40.713')] [2024-08-05 17:55:39,054][15417] Signal inference workers to stop experience collection... (12850 times) [2024-08-05 17:55:39,055][15417] Signal inference workers to resume experience collection... 
(12850 times) [2024-08-05 17:55:39,128][15444] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-08-05 17:55:39,134][15444] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-08-05 17:55:40,582][15444] Updated weights for policy 0, policy_version 34571 (0.0014) [2024-08-05 17:55:43,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 283262976. Throughput: 0: 6061.6. Samples: 70816380. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 17:55:43,119][15372] Avg episode reward: [(0, '41.144')] [2024-08-05 17:55:44,127][15444] Updated weights for policy 0, policy_version 34581 (0.0030) [2024-08-05 17:55:47,465][15444] Updated weights for policy 0, policy_version 34591 (0.0019) [2024-08-05 17:55:48,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 283385856. Throughput: 0: 6033.8. Samples: 70851590. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 17:55:48,119][15372] Avg episode reward: [(0, '39.828')] [2024-08-05 17:55:50,864][15444] Updated weights for policy 0, policy_version 34601 (0.0014) [2024-08-05 17:55:53,126][15372] Fps is (10 sec: 23738.7, 60 sec: 24163.3, 300 sec: 24131.1). Total num frames: 283500544. Throughput: 0: 6029.0. Samples: 70870150. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 17:55:53,128][15372] Avg episode reward: [(0, '40.201')] [2024-08-05 17:55:54,161][15444] Updated weights for policy 0, policy_version 34611 (0.0011) [2024-08-05 17:55:57,872][15444] Updated weights for policy 0, policy_version 34621 (0.0018) [2024-08-05 17:55:58,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 283623424. Throughput: 0: 6019.5. Samples: 70906170. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 17:55:58,120][15372] Avg episode reward: [(0, '40.150')] [2024-08-05 17:56:01,129][15444] Updated weights for policy 0, policy_version 34631 (0.0014) [2024-08-05 17:56:03,118][15372] Fps is (10 sec: 24595.2, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 283746304. Throughput: 0: 6024.9. Samples: 70941900. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 17:56:03,126][15372] Avg episode reward: [(0, '40.398')] [2024-08-05 17:56:04,492][15444] Updated weights for policy 0, policy_version 34641 (0.0015) [2024-08-05 17:56:07,699][15444] Updated weights for policy 0, policy_version 34651 (0.0027) [2024-08-05 17:56:08,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 283860992. Throughput: 0: 6018.3. Samples: 70960180. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:56:08,119][15372] Avg episode reward: [(0, '41.878')] [2024-08-05 17:56:11,080][15444] Updated weights for policy 0, policy_version 34661 (0.0024) [2024-08-05 17:56:13,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 283983872. Throughput: 0: 6030.7. Samples: 70996290. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:56:13,119][15372] Avg episode reward: [(0, '41.439')] [2024-08-05 17:56:14,790][15444] Updated weights for policy 0, policy_version 34671 (0.0025) [2024-08-05 17:56:17,946][15444] Updated weights for policy 0, policy_version 34681 (0.0021) [2024-08-05 17:56:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 284106752. Throughput: 0: 6004.0. Samples: 71032480. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:56:18,119][15372] Avg episode reward: [(0, '40.600')] [2024-08-05 17:56:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000034681_284106752.pth... 
[2024-08-05 17:56:18,270][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000033973_278306816.pth [2024-08-05 17:56:19,406][15417] Signal inference workers to stop experience collection... (12900 times) [2024-08-05 17:56:19,409][15417] Signal inference workers to resume experience collection... (12900 times) [2024-08-05 17:56:19,450][15444] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-08-05 17:56:19,450][15444] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-08-05 17:56:21,539][15444] Updated weights for policy 0, policy_version 34691 (0.0038) [2024-08-05 17:56:23,118][15372] Fps is (10 sec: 23757.2, 60 sec: 23893.5, 300 sec: 24131.7). Total num frames: 284221440. Throughput: 0: 6013.6. Samples: 71050670. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:56:23,119][15372] Avg episode reward: [(0, '40.507')] [2024-08-05 17:56:24,574][15444] Updated weights for policy 0, policy_version 34701 (0.0030) [2024-08-05 17:56:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.6, 300 sec: 24131.7). Total num frames: 284344320. Throughput: 0: 6022.5. Samples: 71087390. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:56:28,119][15372] Avg episode reward: [(0, '41.084')] [2024-08-05 17:56:28,128][15444] Updated weights for policy 0, policy_version 34711 (0.0014) [2024-08-05 17:56:31,815][15444] Updated weights for policy 0, policy_version 34721 (0.0013) [2024-08-05 17:56:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 284467200. Throughput: 0: 6046.7. Samples: 71123690. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 17:56:33,119][15372] Avg episode reward: [(0, '41.140')] [2024-08-05 17:56:34,737][15444] Updated weights for policy 0, policy_version 34731 (0.0013) [2024-08-05 17:56:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24159.5). 
Total num frames: 284590080. Throughput: 0: 6050.4. Samples: 71142370. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 17:56:38,126][15372] Avg episode reward: [(0, '39.562')] [2024-08-05 17:56:38,305][15444] Updated weights for policy 0, policy_version 34741 (0.0012) [2024-08-05 17:56:41,564][15444] Updated weights for policy 0, policy_version 34751 (0.0018) [2024-08-05 17:56:43,119][15372] Fps is (10 sec: 25393.4, 60 sec: 24302.7, 300 sec: 24187.2). Total num frames: 284721152. Throughput: 0: 6051.1. Samples: 71178470. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 17:56:43,128][15372] Avg episode reward: [(0, '39.070')] [2024-08-05 17:56:44,967][15444] Updated weights for policy 0, policy_version 34761 (0.0025) [2024-08-05 17:56:48,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 284827648. Throughput: 0: 6063.5. Samples: 71214760. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 17:56:48,126][15372] Avg episode reward: [(0, '40.211')] [2024-08-05 17:56:48,530][15444] Updated weights for policy 0, policy_version 34771 (0.0011) [2024-08-05 17:56:51,516][15444] Updated weights for policy 0, policy_version 34781 (0.0016) [2024-08-05 17:56:53,075][15417] Signal inference workers to stop experience collection... (12950 times) [2024-08-05 17:56:53,083][15417] Signal inference workers to resume experience collection... (12950 times) [2024-08-05 17:56:53,119][15372] Fps is (10 sec: 22937.5, 60 sec: 24169.2, 300 sec: 24159.4). Total num frames: 284950528. Throughput: 0: 6079.9. Samples: 71233780. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 17:56:53,120][15372] Avg episode reward: [(0, '40.910')] [2024-08-05 17:56:53,126][15444] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-08-05 17:56:53,126][15444] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-08-05 17:56:55,186][15444] Updated weights for policy 0, policy_version 34791 (0.0021) [2024-08-05 17:56:58,118][15372] Fps is (10 sec: 25395.5, 60 sec: 24303.1, 300 sec: 24159.5). Total num frames: 285081600. Throughput: 0: 6086.2. Samples: 71270170. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 17:56:58,126][15372] Avg episode reward: [(0, '41.150')] [2024-08-05 17:56:58,845][15444] Updated weights for policy 0, policy_version 34801 (0.0024) [2024-08-05 17:57:01,957][15444] Updated weights for policy 0, policy_version 34811 (0.0040) [2024-08-05 17:57:03,118][15372] Fps is (10 sec: 23758.5, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 285188096. Throughput: 0: 6059.3. Samples: 71305150. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 17:57:03,119][15372] Avg episode reward: [(0, '40.763')] [2024-08-05 17:57:05,421][15444] Updated weights for policy 0, policy_version 34821 (0.0021) [2024-08-05 17:57:08,119][15372] Fps is (10 sec: 23755.2, 60 sec: 24302.6, 300 sec: 24159.4). Total num frames: 285319168. Throughput: 0: 6073.9. Samples: 71324000. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 17:57:08,119][15372] Avg episode reward: [(0, '41.833')] [2024-08-05 17:57:08,533][15444] Updated weights for policy 0, policy_version 34831 (0.0014) [2024-08-05 17:57:12,205][15444] Updated weights for policy 0, policy_version 34841 (0.0015) [2024-08-05 17:57:13,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 285442048. Throughput: 0: 6054.9. Samples: 71359860. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 17:57:13,119][15372] Avg episode reward: [(0, '41.690')] [2024-08-05 17:57:15,519][15444] Updated weights for policy 0, policy_version 34851 (0.0020) [2024-08-05 17:57:18,118][15372] Fps is (10 sec: 23758.4, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 285556736. Throughput: 0: 6071.8. Samples: 71396920. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 17:57:18,119][15372] Avg episode reward: [(0, '41.286')] [2024-08-05 17:57:18,838][15444] Updated weights for policy 0, policy_version 34861 (0.0025) [2024-08-05 17:57:22,238][15444] Updated weights for policy 0, policy_version 34871 (0.0012) [2024-08-05 17:57:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 285679616. Throughput: 0: 6059.6. Samples: 71415050. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 17:57:23,119][15372] Avg episode reward: [(0, '41.247')] [2024-08-05 17:57:25,542][15444] Updated weights for policy 0, policy_version 34881 (0.0013) [2024-08-05 17:57:28,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 285802496. Throughput: 0: 6055.8. Samples: 71450980. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 17:57:28,119][15372] Avg episode reward: [(0, '40.910')] [2024-08-05 17:57:29,212][15444] Updated weights for policy 0, policy_version 34891 (0.0024) [2024-08-05 17:57:31,791][15417] Signal inference workers to stop experience collection... (13000 times) [2024-08-05 17:57:31,791][15417] Signal inference workers to resume experience collection... 
(13000 times) [2024-08-05 17:57:31,867][15444] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-08-05 17:57:31,867][15444] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-08-05 17:57:32,381][15444] Updated weights for policy 0, policy_version 34901 (0.0023) [2024-08-05 17:57:33,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24302.7, 300 sec: 24187.2). Total num frames: 285925376. Throughput: 0: 6049.5. Samples: 71486990. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 17:57:33,127][15372] Avg episode reward: [(0, '40.935')] [2024-08-05 17:57:35,775][15444] Updated weights for policy 0, policy_version 34911 (0.0032) [2024-08-05 17:57:38,119][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 286040064. Throughput: 0: 6033.2. Samples: 71505270. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 17:57:38,126][15372] Avg episode reward: [(0, '41.733')] [2024-08-05 17:57:39,215][15444] Updated weights for policy 0, policy_version 34921 (0.0011) [2024-08-05 17:57:42,472][15444] Updated weights for policy 0, policy_version 34931 (0.0032) [2024-08-05 17:57:43,118][15372] Fps is (10 sec: 23758.0, 60 sec: 24030.2, 300 sec: 24187.2). Total num frames: 286162944. Throughput: 0: 6028.7. Samples: 71541460. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:57:43,126][15372] Avg episode reward: [(0, '42.024')] [2024-08-05 17:57:46,153][15444] Updated weights for policy 0, policy_version 34941 (0.0014) [2024-08-05 17:57:48,120][15372] Fps is (10 sec: 23753.4, 60 sec: 24165.8, 300 sec: 24159.4). Total num frames: 286277632. Throughput: 0: 6047.6. Samples: 71577300. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:57:48,121][15372] Avg episode reward: [(0, '42.792')] [2024-08-05 17:57:49,210][15444] Updated weights for policy 0, policy_version 34951 (0.0021) [2024-08-05 17:57:52,596][15444] Updated weights for policy 0, policy_version 34961 (0.0018) [2024-08-05 17:57:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.2, 300 sec: 24215.0). Total num frames: 286408704. Throughput: 0: 6043.4. Samples: 71595950. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:57:53,119][15372] Avg episode reward: [(0, '41.892')] [2024-08-05 17:57:56,027][15444] Updated weights for policy 0, policy_version 34971 (0.0020) [2024-08-05 17:57:58,120][15372] Fps is (10 sec: 25396.3, 60 sec: 24165.9, 300 sec: 24214.9). Total num frames: 286531584. Throughput: 0: 6049.8. Samples: 71632110. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 17:57:58,120][15372] Avg episode reward: [(0, '41.232')] [2024-08-05 17:57:59,541][15444] Updated weights for policy 0, policy_version 34981 (0.0013) [2024-08-05 17:58:03,014][15444] Updated weights for policy 0, policy_version 34991 (0.0020) [2024-08-05 17:58:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 286646272. Throughput: 0: 6015.6. Samples: 71667620. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:58:03,119][15372] Avg episode reward: [(0, '40.253')] [2024-08-05 17:58:06,510][15444] Updated weights for policy 0, policy_version 35001 (0.0017) [2024-08-05 17:58:08,118][15372] Fps is (10 sec: 23759.5, 60 sec: 24166.7, 300 sec: 24187.5). Total num frames: 286769152. Throughput: 0: 6014.2. Samples: 71685690. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:58:08,126][15372] Avg episode reward: [(0, '41.666')] [2024-08-05 17:58:09,737][15444] Updated weights for policy 0, policy_version 35011 (0.0013) [2024-08-05 17:58:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). 
Total num frames: 286883840. Throughput: 0: 6025.1. Samples: 71722110. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:58:13,126][15372] Avg episode reward: [(0, '41.646')] [2024-08-05 17:58:13,261][15444] Updated weights for policy 0, policy_version 35021 (0.0027) [2024-08-05 17:58:16,298][15444] Updated weights for policy 0, policy_version 35031 (0.0019) [2024-08-05 17:58:18,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24188.4). Total num frames: 287014912. Throughput: 0: 6030.3. Samples: 71758350. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:58:18,127][15372] Avg episode reward: [(0, '41.767')] [2024-08-05 17:58:18,131][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000035036_287014912.pth... [2024-08-05 17:58:18,259][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000034327_281206784.pth [2024-08-05 17:58:19,978][15444] Updated weights for policy 0, policy_version 35041 (0.0022) [2024-08-05 17:58:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 287129600. Throughput: 0: 6014.9. Samples: 71775940. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 17:58:23,126][15372] Avg episode reward: [(0, '41.319')] [2024-08-05 17:58:23,750][15444] Updated weights for policy 0, policy_version 35051 (0.0031) [2024-08-05 17:58:26,570][15417] Signal inference workers to stop experience collection... (13050 times) [2024-08-05 17:58:26,571][15417] Signal inference workers to resume experience collection... 
(13050 times) [2024-08-05 17:58:26,650][15444] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-08-05 17:58:26,650][15444] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-08-05 17:58:26,671][15444] Updated weights for policy 0, policy_version 35061 (0.0012) [2024-08-05 17:58:28,118][15372] Fps is (10 sec: 22937.9, 60 sec: 24030.0, 300 sec: 24159.5). Total num frames: 287244288. Throughput: 0: 6007.1. Samples: 71811780. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:58:28,126][15372] Avg episode reward: [(0, '40.843')] [2024-08-05 17:58:30,294][15444] Updated weights for policy 0, policy_version 35071 (0.0017) [2024-08-05 17:58:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.6, 300 sec: 24159.5). Total num frames: 287375360. Throughput: 0: 6020.2. Samples: 71848200. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:58:33,119][15372] Avg episode reward: [(0, '41.550')] [2024-08-05 17:58:33,413][15444] Updated weights for policy 0, policy_version 35081 (0.0029) [2024-08-05 17:58:37,145][15444] Updated weights for policy 0, policy_version 35091 (0.0013) [2024-08-05 17:58:38,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 287490048. Throughput: 0: 6003.5. Samples: 71866110. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:58:38,119][15372] Avg episode reward: [(0, '42.345')] [2024-08-05 17:58:40,519][15444] Updated weights for policy 0, policy_version 35101 (0.0010) [2024-08-05 17:58:43,119][15372] Fps is (10 sec: 22936.9, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 287604736. Throughput: 0: 5995.7. Samples: 71901910. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 17:58:43,127][15372] Avg episode reward: [(0, '41.375')] [2024-08-05 17:58:43,979][15444] Updated weights for policy 0, policy_version 35111 (0.0016) [2024-08-05 17:58:47,421][15444] Updated weights for policy 0, policy_version 35121 (0.0016) [2024-08-05 17:58:48,119][15372] Fps is (10 sec: 23757.5, 60 sec: 24167.0, 300 sec: 24131.7). Total num frames: 287727616. Throughput: 0: 6001.3. Samples: 71937680. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 17:58:48,119][15372] Avg episode reward: [(0, '41.103')] [2024-08-05 17:58:50,679][15444] Updated weights for policy 0, policy_version 35131 (0.0021) [2024-08-05 17:58:53,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24029.7, 300 sec: 24159.6). Total num frames: 287850496. Throughput: 0: 6006.6. Samples: 71955990. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 17:58:53,127][15372] Avg episode reward: [(0, '42.335')] [2024-08-05 17:58:54,230][15444] Updated weights for policy 0, policy_version 35141 (0.0009) [2024-08-05 17:58:57,408][15444] Updated weights for policy 0, policy_version 35151 (0.0018) [2024-08-05 17:58:58,118][15372] Fps is (10 sec: 23757.1, 60 sec: 23893.8, 300 sec: 24159.5). Total num frames: 287965184. Throughput: 0: 6006.4. Samples: 71992400. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 17:58:58,126][15372] Avg episode reward: [(0, '41.933')] [2024-08-05 17:59:00,960][15444] Updated weights for policy 0, policy_version 35161 (0.0028) [2024-08-05 17:59:03,119][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.8, 300 sec: 24132.3). Total num frames: 288088064. Throughput: 0: 5993.1. Samples: 72028040. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 17:59:03,127][15372] Avg episode reward: [(0, '41.802')] [2024-08-05 17:59:04,546][15444] Updated weights for policy 0, policy_version 35171 (0.0020) [2024-08-05 17:59:08,119][15372] Fps is (10 sec: 22936.0, 60 sec: 23756.5, 300 sec: 24131.6). 
Total num frames: 288194560. Throughput: 0: 5986.6. Samples: 72045340. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 17:59:08,127][15372] Avg episode reward: [(0, '41.556')] [2024-08-05 17:59:08,174][15444] Updated weights for policy 0, policy_version 35181 (0.0013) [2024-08-05 17:59:11,262][15444] Updated weights for policy 0, policy_version 35191 (0.0014) [2024-08-05 17:59:13,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 288325632. Throughput: 0: 5992.4. Samples: 72081440. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:59:13,126][15372] Avg episode reward: [(0, '40.958')] [2024-08-05 17:59:14,511][15444] Updated weights for policy 0, policy_version 35201 (0.0013) [2024-08-05 17:59:17,961][15444] Updated weights for policy 0, policy_version 35211 (0.0017) [2024-08-05 17:59:18,119][15372] Fps is (10 sec: 25396.8, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 288448512. Throughput: 0: 5996.4. Samples: 72118040. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:59:18,119][15372] Avg episode reward: [(0, '40.417')] [2024-08-05 17:59:21,587][15444] Updated weights for policy 0, policy_version 35221 (0.0012) [2024-08-05 17:59:23,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 288571392. Throughput: 0: 6005.8. Samples: 72136370. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:59:23,119][15372] Avg episode reward: [(0, '41.200')] [2024-08-05 17:59:24,102][15417] Signal inference workers to stop experience collection... (13100 times) [2024-08-05 17:59:24,103][15417] Signal inference workers to resume experience collection... 
(13100 times) [2024-08-05 17:59:24,149][15444] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-08-05 17:59:24,150][15444] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-08-05 17:59:24,785][15444] Updated weights for policy 0, policy_version 35231 (0.0021) [2024-08-05 17:59:28,119][15372] Fps is (10 sec: 23755.5, 60 sec: 24029.6, 300 sec: 24131.6). Total num frames: 288686080. Throughput: 0: 6020.2. Samples: 72172820. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 17:59:28,127][15372] Avg episode reward: [(0, '42.123')] [2024-08-05 17:59:28,433][15444] Updated weights for policy 0, policy_version 35241 (0.0024) [2024-08-05 17:59:31,342][15444] Updated weights for policy 0, policy_version 35251 (0.0024) [2024-08-05 17:59:33,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 288808960. Throughput: 0: 6010.5. Samples: 72208150. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:59:33,126][15372] Avg episode reward: [(0, '42.493')] [2024-08-05 17:59:35,076][15444] Updated weights for policy 0, policy_version 35261 (0.0037) [2024-08-05 17:59:38,121][15372] Fps is (10 sec: 24572.5, 60 sec: 24029.2, 300 sec: 24159.3). Total num frames: 288931840. Throughput: 0: 6002.4. Samples: 72226110. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:59:38,128][15372] Avg episode reward: [(0, '41.777')] [2024-08-05 17:59:38,734][15444] Updated weights for policy 0, policy_version 35271 (0.0011) [2024-08-05 17:59:41,631][15444] Updated weights for policy 0, policy_version 35281 (0.0039) [2024-08-05 17:59:43,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 289046528. Throughput: 0: 5998.2. Samples: 72262320. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:59:43,126][15372] Avg episode reward: [(0, '40.795')] [2024-08-05 17:59:45,184][15444] Updated weights for policy 0, policy_version 35291 (0.0011) [2024-08-05 17:59:48,119][15372] Fps is (10 sec: 23760.0, 60 sec: 24029.6, 300 sec: 24131.6). Total num frames: 289169408. Throughput: 0: 6012.6. Samples: 72298610. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:59:48,127][15372] Avg episode reward: [(0, '41.168')] [2024-08-05 17:59:48,553][15444] Updated weights for policy 0, policy_version 35301 (0.0023) [2024-08-05 17:59:52,012][15444] Updated weights for policy 0, policy_version 35311 (0.0013) [2024-08-05 17:59:53,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 289292288. Throughput: 0: 6037.6. Samples: 72317030. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 17:59:53,119][15372] Avg episode reward: [(0, '41.518')] [2024-08-05 17:59:55,724][15444] Updated weights for policy 0, policy_version 35321 (0.0019) [2024-08-05 17:59:58,124][15372] Fps is (10 sec: 24565.1, 60 sec: 24164.3, 300 sec: 24131.3). Total num frames: 289415168. Throughput: 0: 6025.5. Samples: 72352620. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 17:59:58,131][15372] Avg episode reward: [(0, '41.046')] [2024-08-05 17:59:58,891][15444] Updated weights for policy 0, policy_version 35331 (0.0011) [2024-08-05 18:00:02,250][15444] Updated weights for policy 0, policy_version 35341 (0.0023) [2024-08-05 18:00:03,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24030.0, 300 sec: 24103.9). Total num frames: 289529856. Throughput: 0: 6016.7. Samples: 72388790. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 18:00:03,119][15372] Avg episode reward: [(0, '40.572')] [2024-08-05 18:00:05,549][15444] Updated weights for policy 0, policy_version 35351 (0.0028) [2024-08-05 18:00:08,118][15372] Fps is (10 sec: 23769.1, 60 sec: 24303.2, 300 sec: 24131.7). 
Total num frames: 289652736. Throughput: 0: 6008.7. Samples: 72406760. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 18:00:08,126][15372] Avg episode reward: [(0, '41.052')] [2024-08-05 18:00:09,110][15444] Updated weights for policy 0, policy_version 35361 (0.0013) [2024-08-05 18:00:09,910][15417] Signal inference workers to stop experience collection... (13150 times) [2024-08-05 18:00:09,927][15417] Signal inference workers to resume experience collection... (13150 times) [2024-08-05 18:00:09,983][15444] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-08-05 18:00:09,991][15444] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-08-05 18:00:12,464][15444] Updated weights for policy 0, policy_version 35371 (0.0016) [2024-08-05 18:00:13,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24166.2, 300 sec: 24131.7). Total num frames: 289775616. Throughput: 0: 6010.7. Samples: 72443300. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 18:00:13,119][15372] Avg episode reward: [(0, '42.122')] [2024-08-05 18:00:15,739][15444] Updated weights for policy 0, policy_version 35381 (0.0021) [2024-08-05 18:00:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 289890304. Throughput: 0: 6037.3. Samples: 72479830. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:00:18,126][15372] Avg episode reward: [(0, '42.340')] [2024-08-05 18:00:18,138][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000035388_289898496.pth... 
[2024-08-05 18:00:18,282][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000034681_284106752.pth [2024-08-05 18:00:19,222][15444] Updated weights for policy 0, policy_version 35391 (0.0020) [2024-08-05 18:00:22,642][15444] Updated weights for policy 0, policy_version 35401 (0.0041) [2024-08-05 18:00:23,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 290013184. Throughput: 0: 6036.3. Samples: 72497730. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:00:23,119][15372] Avg episode reward: [(0, '42.266')] [2024-08-05 18:00:25,788][15444] Updated weights for policy 0, policy_version 35411 (0.0010) [2024-08-05 18:00:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.7, 300 sec: 24103.9). Total num frames: 290136064. Throughput: 0: 6037.6. Samples: 72534010. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:00:28,126][15372] Avg episode reward: [(0, '41.177')] [2024-08-05 18:00:29,360][15444] Updated weights for policy 0, policy_version 35421 (0.0017) [2024-08-05 18:00:32,865][15444] Updated weights for policy 0, policy_version 35431 (0.0031) [2024-08-05 18:00:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 290258944. Throughput: 0: 6035.7. Samples: 72570210. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:00:33,120][15372] Avg episode reward: [(0, '43.100')] [2024-08-05 18:00:33,121][15417] Saving new best policy, reward=43.100! [2024-08-05 18:00:35,947][15444] Updated weights for policy 0, policy_version 35441 (0.0013) [2024-08-05 18:00:38,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24167.2, 300 sec: 24131.7). Total num frames: 290381824. Throughput: 0: 6030.9. Samples: 72588420. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:00:38,119][15372] Avg episode reward: [(0, '42.988')] [2024-08-05 18:00:39,705][15444] Updated weights for policy 0, policy_version 35451 (0.0013) [2024-08-05 18:00:42,977][15444] Updated weights for policy 0, policy_version 35461 (0.0024) [2024-08-05 18:00:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 290496512. Throughput: 0: 6039.4. Samples: 72624360. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:00:43,119][15372] Avg episode reward: [(0, '41.995')] [2024-08-05 18:00:46,326][15444] Updated weights for policy 0, policy_version 35471 (0.0024) [2024-08-05 18:00:48,119][15372] Fps is (10 sec: 22936.2, 60 sec: 24029.9, 300 sec: 24104.5). Total num frames: 290611200. Throughput: 0: 6034.6. Samples: 72660350. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:00:48,127][15372] Avg episode reward: [(0, '42.055')] [2024-08-05 18:00:49,804][15444] Updated weights for policy 0, policy_version 35481 (0.0014) [2024-08-05 18:00:53,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 290734080. Throughput: 0: 6046.2. Samples: 72678840. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:00:53,127][15372] Avg episode reward: [(0, '41.025')] [2024-08-05 18:00:53,153][15444] Updated weights for policy 0, policy_version 35491 (0.0029) [2024-08-05 18:00:56,516][15444] Updated weights for policy 0, policy_version 35501 (0.0013) [2024-08-05 18:00:58,118][15372] Fps is (10 sec: 25396.7, 60 sec: 24168.5, 300 sec: 24131.7). Total num frames: 290865152. Throughput: 0: 6039.8. Samples: 72715090. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:00:58,126][15372] Avg episode reward: [(0, '40.218')] [2024-08-05 18:00:59,860][15444] Updated weights for policy 0, policy_version 35511 (0.0012) [2024-08-05 18:01:03,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24131.7). 
Total num frames: 290979840. Throughput: 0: 6042.9. Samples: 72751760. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:01:03,126][15372] Avg episode reward: [(0, '40.783')] [2024-08-05 18:01:03,223][15444] Updated weights for policy 0, policy_version 35521 (0.0018) [2024-08-05 18:01:06,612][15444] Updated weights for policy 0, policy_version 35531 (0.0031) [2024-08-05 18:01:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 291102720. Throughput: 0: 6052.4. Samples: 72770090. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:01:08,126][15372] Avg episode reward: [(0, '41.274')] [2024-08-05 18:01:09,940][15444] Updated weights for policy 0, policy_version 35541 (0.0014) [2024-08-05 18:01:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.6, 300 sec: 24131.7). Total num frames: 291225600. Throughput: 0: 6055.8. Samples: 72806520. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:01:13,126][15372] Avg episode reward: [(0, '40.718')] [2024-08-05 18:01:13,457][15444] Updated weights for policy 0, policy_version 35551 (0.0026) [2024-08-05 18:01:16,669][15444] Updated weights for policy 0, policy_version 35561 (0.0016) [2024-08-05 18:01:18,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24302.8, 300 sec: 24159.4). Total num frames: 291348480. Throughput: 0: 6043.5. Samples: 72842170. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:01:18,119][15372] Avg episode reward: [(0, '41.251')] [2024-08-05 18:01:19,426][15417] Signal inference workers to stop experience collection... (13200 times) [2024-08-05 18:01:19,427][15417] Signal inference workers to resume experience collection... 
(13200 times) [2024-08-05 18:01:19,473][15444] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-08-05 18:01:19,479][15444] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-08-05 18:01:20,036][15444] Updated weights for policy 0, policy_version 35571 (0.0021) [2024-08-05 18:01:23,120][15372] Fps is (10 sec: 23752.9, 60 sec: 24165.7, 300 sec: 24131.6). Total num frames: 291463168. Throughput: 0: 6048.9. Samples: 72860630. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:01:23,128][15372] Avg episode reward: [(0, '41.073')] [2024-08-05 18:01:23,632][15444] Updated weights for policy 0, policy_version 35581 (0.0017) [2024-08-05 18:01:26,969][15444] Updated weights for policy 0, policy_version 35591 (0.0015) [2024-08-05 18:01:28,118][15372] Fps is (10 sec: 23757.5, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 291586048. Throughput: 0: 6050.7. Samples: 72896640. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:01:28,119][15372] Avg episode reward: [(0, '40.563')] [2024-08-05 18:01:30,252][15444] Updated weights for policy 0, policy_version 35601 (0.0013) [2024-08-05 18:01:33,119][15372] Fps is (10 sec: 24578.5, 60 sec: 24166.1, 300 sec: 24131.6). Total num frames: 291708928. Throughput: 0: 6068.9. Samples: 72933450. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:01:33,127][15372] Avg episode reward: [(0, '40.341')] [2024-08-05 18:01:33,512][15444] Updated weights for policy 0, policy_version 35611 (0.0017) [2024-08-05 18:01:37,081][15444] Updated weights for policy 0, policy_version 35621 (0.0012) [2024-08-05 18:01:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24104.0). Total num frames: 291831808. Throughput: 0: 6060.2. Samples: 72951550. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:01:38,119][15372] Avg episode reward: [(0, '41.623')] [2024-08-05 18:01:40,444][15444] Updated weights for policy 0, policy_version 35631 (0.0020) [2024-08-05 18:01:43,118][15372] Fps is (10 sec: 24577.6, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 291954688. Throughput: 0: 6073.3. Samples: 72988390. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:01:43,119][15372] Avg episode reward: [(0, '41.705')] [2024-08-05 18:01:43,651][15444] Updated weights for policy 0, policy_version 35641 (0.0024) [2024-08-05 18:01:47,362][15444] Updated weights for policy 0, policy_version 35651 (0.0013) [2024-08-05 18:01:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.7, 300 sec: 24159.5). Total num frames: 292077568. Throughput: 0: 6056.7. Samples: 73024310. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:01:48,119][15372] Avg episode reward: [(0, '41.991')] [2024-08-05 18:01:50,499][15444] Updated weights for policy 0, policy_version 35661 (0.0011) [2024-08-05 18:01:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24103.9). Total num frames: 292192256. Throughput: 0: 6058.4. Samples: 73042720. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:01:53,126][15372] Avg episode reward: [(0, '41.514')] [2024-08-05 18:01:53,825][15444] Updated weights for policy 0, policy_version 35671 (0.0011) [2024-08-05 18:01:57,399][15444] Updated weights for policy 0, policy_version 35681 (0.0013) [2024-08-05 18:01:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 292315136. Throughput: 0: 6041.8. Samples: 73078400. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:01:58,119][15372] Avg episode reward: [(0, '42.206')] [2024-08-05 18:02:00,689][15444] Updated weights for policy 0, policy_version 35691 (0.0014) [2024-08-05 18:02:03,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24131.7). 
Total num frames: 292438016. Throughput: 0: 6061.6. Samples: 73114940. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:02:03,126][15372] Avg episode reward: [(0, '42.883')] [2024-08-05 18:02:04,382][15444] Updated weights for policy 0, policy_version 35701 (0.0013) [2024-08-05 18:02:07,592][15444] Updated weights for policy 0, policy_version 35711 (0.0018) [2024-08-05 18:02:08,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 292552704. Throughput: 0: 6048.4. Samples: 73132800. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:02:08,119][15372] Avg episode reward: [(0, '42.054')] [2024-08-05 18:02:10,938][15444] Updated weights for policy 0, policy_version 35721 (0.0015) [2024-08-05 18:02:13,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 292675584. Throughput: 0: 6038.6. Samples: 73168380. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:02:13,126][15372] Avg episode reward: [(0, '41.252')] [2024-08-05 18:02:14,362][15444] Updated weights for policy 0, policy_version 35731 (0.0015) [2024-08-05 18:02:17,736][15444] Updated weights for policy 0, policy_version 35741 (0.0020) [2024-08-05 18:02:18,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 292798464. Throughput: 0: 6028.0. Samples: 73204710. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:02:18,119][15372] Avg episode reward: [(0, '42.304')] [2024-08-05 18:02:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000035742_292798464.pth... [2024-08-05 18:02:18,255][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000035036_287014912.pth [2024-08-05 18:02:20,048][15417] Signal inference workers to stop experience collection... 
(13250 times) [2024-08-05 18:02:20,049][15417] Signal inference workers to resume experience collection... (13250 times) [2024-08-05 18:02:20,125][15444] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-08-05 18:02:20,130][15444] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-08-05 18:02:21,290][15444] Updated weights for policy 0, policy_version 35751 (0.0017) [2024-08-05 18:02:23,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24303.6, 300 sec: 24131.7). Total num frames: 292921344. Throughput: 0: 6034.0. Samples: 73223080. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:02:23,119][15372] Avg episode reward: [(0, '41.918')] [2024-08-05 18:02:24,520][15444] Updated weights for policy 0, policy_version 35761 (0.0029) [2024-08-05 18:02:27,935][15444] Updated weights for policy 0, policy_version 35771 (0.0011) [2024-08-05 18:02:28,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24166.4, 300 sec: 24104.0). Total num frames: 293036032. Throughput: 0: 6038.0. Samples: 73260100. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:02:28,119][15372] Avg episode reward: [(0, '42.037')] [2024-08-05 18:02:31,118][15444] Updated weights for policy 0, policy_version 35781 (0.0027) [2024-08-05 18:02:33,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.6, 300 sec: 24131.7). Total num frames: 293158912. Throughput: 0: 6042.2. Samples: 73296210. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:02:33,126][15372] Avg episode reward: [(0, '41.333')] [2024-08-05 18:02:34,663][15444] Updated weights for policy 0, policy_version 35791 (0.0016) [2024-08-05 18:02:37,888][15444] Updated weights for policy 0, policy_version 35801 (0.0013) [2024-08-05 18:02:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 293281792. Throughput: 0: 6051.5. Samples: 73315040. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:02:38,119][15372] Avg episode reward: [(0, '41.243')] [2024-08-05 18:02:41,232][15444] Updated weights for policy 0, policy_version 35811 (0.0014) [2024-08-05 18:02:43,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24159.6). Total num frames: 293404672. Throughput: 0: 6048.9. Samples: 73350600. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:02:43,126][15372] Avg episode reward: [(0, '41.762')] [2024-08-05 18:02:44,888][15444] Updated weights for policy 0, policy_version 35821 (0.0013) [2024-08-05 18:02:48,005][15444] Updated weights for policy 0, policy_version 35831 (0.0029) [2024-08-05 18:02:48,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 293527552. Throughput: 0: 6045.8. Samples: 73387000. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:02:48,119][15372] Avg episode reward: [(0, '41.859')] [2024-08-05 18:02:51,578][15444] Updated weights for policy 0, policy_version 35841 (0.0014) [2024-08-05 18:02:53,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.3, 300 sec: 24104.0). Total num frames: 293642240. Throughput: 0: 6064.0. Samples: 73405680. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:02:53,119][15372] Avg episode reward: [(0, '41.000')] [2024-08-05 18:02:54,836][15444] Updated weights for policy 0, policy_version 35851 (0.0013) [2024-08-05 18:02:58,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.8, 300 sec: 24159.4). Total num frames: 293773312. Throughput: 0: 6084.4. Samples: 73442180. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 18:02:58,119][15444] Updated weights for policy 0, policy_version 35861 (0.0015) [2024-08-05 18:02:58,127][15372] Avg episode reward: [(0, '40.849')] [2024-08-05 18:03:01,027][15417] Signal inference workers to stop experience collection... (13300 times) [2024-08-05 18:03:01,028][15417] Signal inference workers to resume experience collection... 
(13300 times) [2024-08-05 18:03:01,086][15444] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-08-05 18:03:01,086][15444] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-08-05 18:03:01,807][15444] Updated weights for policy 0, policy_version 35871 (0.0012) [2024-08-05 18:03:03,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 293888000. Throughput: 0: 6066.5. Samples: 73477700. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 18:03:03,119][15372] Avg episode reward: [(0, '40.789')] [2024-08-05 18:03:04,817][15444] Updated weights for policy 0, policy_version 35881 (0.0016) [2024-08-05 18:03:08,118][15372] Fps is (10 sec: 22938.1, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 294002688. Throughput: 0: 6088.4. Samples: 73497060. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 18:03:08,126][15372] Avg episode reward: [(0, '40.570')] [2024-08-05 18:03:08,390][15444] Updated weights for policy 0, policy_version 35891 (0.0014) [2024-08-05 18:03:11,972][15444] Updated weights for policy 0, policy_version 35901 (0.0012) [2024-08-05 18:03:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24131.7). Total num frames: 294133760. Throughput: 0: 6070.9. Samples: 73533290. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 18:03:13,119][15372] Avg episode reward: [(0, '39.883')] [2024-08-05 18:03:15,027][15444] Updated weights for policy 0, policy_version 35911 (0.0018) [2024-08-05 18:03:18,127][15372] Fps is (10 sec: 24554.7, 60 sec: 24163.1, 300 sec: 24131.0). Total num frames: 294248448. Throughput: 0: 6071.7. Samples: 73569490. 
Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 18:03:18,135][15372] Avg episode reward: [(0, '40.496')] [2024-08-05 18:03:18,646][15444] Updated weights for policy 0, policy_version 35921 (0.0024) [2024-08-05 18:03:21,828][15444] Updated weights for policy 0, policy_version 35931 (0.0024) [2024-08-05 18:03:23,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24166.2, 300 sec: 24159.4). Total num frames: 294371328. Throughput: 0: 6065.1. Samples: 73587970. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:03:23,120][15372] Avg episode reward: [(0, '41.518')] [2024-08-05 18:03:25,133][15444] Updated weights for policy 0, policy_version 35941 (0.0017) [2024-08-05 18:03:28,119][15372] Fps is (10 sec: 24595.8, 60 sec: 24302.7, 300 sec: 24131.6). Total num frames: 294494208. Throughput: 0: 6076.6. Samples: 73624050. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:03:28,127][15372] Avg episode reward: [(0, '40.644')] [2024-08-05 18:03:28,782][15444] Updated weights for policy 0, policy_version 35951 (0.0028) [2024-08-05 18:03:32,088][15444] Updated weights for policy 0, policy_version 35961 (0.0017) [2024-08-05 18:03:33,119][15372] Fps is (10 sec: 24576.6, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 294617088. Throughput: 0: 6062.9. Samples: 73659830. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:03:33,119][15372] Avg episode reward: [(0, '39.739')] [2024-08-05 18:03:35,474][15444] Updated weights for policy 0, policy_version 35971 (0.0025) [2024-08-05 18:03:38,119][15372] Fps is (10 sec: 24577.0, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 294739968. Throughput: 0: 6059.3. Samples: 73678350. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:03:38,119][15372] Avg episode reward: [(0, '41.490')] [2024-08-05 18:03:38,760][15444] Updated weights for policy 0, policy_version 35981 (0.0018) [2024-08-05 18:03:42,261][15444] Updated weights for policy 0, policy_version 35991 (0.0011) [2024-08-05 18:03:43,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 294854656. Throughput: 0: 6053.4. Samples: 73714580. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:03:43,119][15372] Avg episode reward: [(0, '40.690')] [2024-08-05 18:03:45,407][15444] Updated weights for policy 0, policy_version 36001 (0.0011) [2024-08-05 18:03:48,119][15372] Fps is (10 sec: 24576.3, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 294985728. Throughput: 0: 6089.8. Samples: 73751740. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 18:03:48,119][15372] Avg episode reward: [(0, '40.424')] [2024-08-05 18:03:48,868][15444] Updated weights for policy 0, policy_version 36011 (0.0020) [2024-08-05 18:03:52,359][15444] Updated weights for policy 0, policy_version 36021 (0.0022) [2024-08-05 18:03:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 295100416. Throughput: 0: 6069.3. Samples: 73770180. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 18:03:53,119][15372] Avg episode reward: [(0, '42.278')] [2024-08-05 18:03:53,175][15417] Signal inference workers to stop experience collection... (13350 times) [2024-08-05 18:03:53,177][15417] Signal inference workers to resume experience collection... 
(13350 times) [2024-08-05 18:03:53,240][15444] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-08-05 18:03:53,247][15444] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-08-05 18:03:55,505][15444] Updated weights for policy 0, policy_version 36031 (0.0021) [2024-08-05 18:03:58,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 295223296. Throughput: 0: 6077.6. Samples: 73806780. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 18:03:58,119][15372] Avg episode reward: [(0, '41.859')] [2024-08-05 18:03:58,971][15444] Updated weights for policy 0, policy_version 36041 (0.0017) [2024-08-05 18:04:02,192][15444] Updated weights for policy 0, policy_version 36051 (0.0020) [2024-08-05 18:04:03,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 295337984. Throughput: 0: 6064.5. Samples: 73842340. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 18:04:03,126][15372] Avg episode reward: [(0, '41.163')] [2024-08-05 18:04:05,743][15444] Updated weights for policy 0, policy_version 36061 (0.0020) [2024-08-05 18:04:08,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24439.4, 300 sec: 24215.0). Total num frames: 295469056. Throughput: 0: 6072.2. Samples: 73861220. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:04:08,119][15372] Avg episode reward: [(0, '41.746')] [2024-08-05 18:04:09,240][15444] Updated weights for policy 0, policy_version 36071 (0.0029) [2024-08-05 18:04:12,558][15444] Updated weights for policy 0, policy_version 36081 (0.0018) [2024-08-05 18:04:13,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 295583744. Throughput: 0: 6072.1. Samples: 73897290. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:04:13,119][15372] Avg episode reward: [(0, '41.739')] [2024-08-05 18:04:15,962][15444] Updated weights for policy 0, policy_version 36091 (0.0025) [2024-08-05 18:04:18,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24306.4, 300 sec: 24187.2). Total num frames: 295706624. Throughput: 0: 6077.6. Samples: 73933320. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:04:18,125][15372] Avg episode reward: [(0, '41.280')] [2024-08-05 18:04:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000036097_295706624.pth... [2024-08-05 18:04:18,241][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000035388_289898496.pth [2024-08-05 18:04:19,397][15444] Updated weights for policy 0, policy_version 36101 (0.0011) [2024-08-05 18:04:22,567][15444] Updated weights for policy 0, policy_version 36111 (0.0024) [2024-08-05 18:04:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.6, 300 sec: 24187.3). Total num frames: 295821312. Throughput: 0: 6058.7. Samples: 73950990. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:04:23,119][15372] Avg episode reward: [(0, '40.206')] [2024-08-05 18:04:25,987][15444] Updated weights for policy 0, policy_version 36121 (0.0016) [2024-08-05 18:04:28,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 295944192. Throughput: 0: 6050.8. Samples: 73986870. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:04:28,120][15372] Avg episode reward: [(0, '41.092')] [2024-08-05 18:04:29,738][15444] Updated weights for policy 0, policy_version 36131 (0.0014) [2024-08-05 18:04:32,850][15444] Updated weights for policy 0, policy_version 36141 (0.0013) [2024-08-05 18:04:33,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24187.4). Total num frames: 296067072. 
Throughput: 0: 6029.6. Samples: 74023070. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 18:04:33,119][15372] Avg episode reward: [(0, '40.756')] [2024-08-05 18:04:36,295][15444] Updated weights for policy 0, policy_version 36151 (0.0031) [2024-08-05 18:04:38,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 296189952. Throughput: 0: 6037.1. Samples: 74041850. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 18:04:38,126][15372] Avg episode reward: [(0, '40.870')] [2024-08-05 18:04:39,752][15444] Updated weights for policy 0, policy_version 36161 (0.0011) [2024-08-05 18:04:43,072][15444] Updated weights for policy 0, policy_version 36171 (0.0011) [2024-08-05 18:04:43,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24215.1). Total num frames: 296312832. Throughput: 0: 6045.3. Samples: 74078820. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 18:04:43,119][15372] Avg episode reward: [(0, '42.094')] [2024-08-05 18:04:46,582][15444] Updated weights for policy 0, policy_version 36181 (0.0014) [2024-08-05 18:04:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 296427520. Throughput: 0: 6029.4. Samples: 74113660. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 18:04:48,126][15372] Avg episode reward: [(0, '42.528')] [2024-08-05 18:04:49,311][15417] Signal inference workers to stop experience collection... (13400 times) [2024-08-05 18:04:49,312][15417] Signal inference workers to resume experience collection... (13400 times) [2024-08-05 18:04:49,355][15444] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-08-05 18:04:49,363][15444] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-08-05 18:04:49,784][15444] Updated weights for policy 0, policy_version 36191 (0.0024) [2024-08-05 18:04:53,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24187.7). Total num frames: 296550400. 
Throughput: 0: 6035.8. Samples: 74132830. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 18:04:53,119][15372] Avg episode reward: [(0, '41.911')] [2024-08-05 18:04:53,232][15444] Updated weights for policy 0, policy_version 36201 (0.0022) [2024-08-05 18:04:56,678][15444] Updated weights for policy 0, policy_version 36211 (0.0033) [2024-08-05 18:04:58,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24302.9, 300 sec: 24242.7). Total num frames: 296681472. Throughput: 0: 6039.1. Samples: 74169050. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:04:58,126][15372] Avg episode reward: [(0, '41.305')] [2024-08-05 18:04:59,935][15444] Updated weights for policy 0, policy_version 36221 (0.0017) [2024-08-05 18:05:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 296796160. Throughput: 0: 6043.8. Samples: 74205290. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:05:03,126][15372] Avg episode reward: [(0, '40.907')] [2024-08-05 18:05:03,623][15444] Updated weights for policy 0, policy_version 36231 (0.0027) [2024-08-05 18:05:06,496][15444] Updated weights for policy 0, policy_version 36241 (0.0013) [2024-08-05 18:05:08,119][15372] Fps is (10 sec: 22937.1, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 296910848. Throughput: 0: 6065.9. Samples: 74223960. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:05:08,127][15372] Avg episode reward: [(0, '40.372')] [2024-08-05 18:05:10,144][15444] Updated weights for policy 0, policy_version 36251 (0.0019) [2024-08-05 18:05:13,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24302.8, 300 sec: 24242.7). Total num frames: 297041920. Throughput: 0: 6076.5. Samples: 74260310. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:05:13,127][15372] Avg episode reward: [(0, '41.309')] [2024-08-05 18:05:13,699][15444] Updated weights for policy 0, policy_version 36261 (0.0020) [2024-08-05 18:05:16,778][15444] Updated weights for policy 0, policy_version 36271 (0.0020) [2024-08-05 18:05:18,119][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 297156608. Throughput: 0: 6070.4. Samples: 74296240. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 18:05:18,119][15372] Avg episode reward: [(0, '43.182')] [2024-08-05 18:05:18,122][15417] Saving new best policy, reward=43.182! [2024-08-05 18:05:18,900][15417] Signal inference workers to stop experience collection... (13450 times) [2024-08-05 18:05:18,907][15417] Signal inference workers to resume experience collection... (13450 times) [2024-08-05 18:05:18,949][15444] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-08-05 18:05:18,949][15444] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-08-05 18:05:20,285][15444] Updated weights for policy 0, policy_version 36281 (0.0023) [2024-08-05 18:05:23,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 297287680. Throughput: 0: 6066.9. Samples: 74314860. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 18:05:23,119][15372] Avg episode reward: [(0, '42.610')] [2024-08-05 18:05:23,291][15444] Updated weights for policy 0, policy_version 36291 (0.0014) [2024-08-05 18:05:27,016][15444] Updated weights for policy 0, policy_version 36301 (0.0013) [2024-08-05 18:05:28,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24303.1, 300 sec: 24215.0). Total num frames: 297402368. Throughput: 0: 6055.1. Samples: 74351300. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 18:05:28,119][15372] Avg episode reward: [(0, '41.727')] [2024-08-05 18:05:30,264][15444] Updated weights for policy 0, policy_version 36311 (0.0013) [2024-08-05 18:05:33,119][15372] Fps is (10 sec: 22937.4, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 297517056. Throughput: 0: 6072.7. Samples: 74386930. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 18:05:33,119][15372] Avg episode reward: [(0, '41.381')] [2024-08-05 18:05:33,822][15444] Updated weights for policy 0, policy_version 36321 (0.0016) [2024-08-05 18:05:37,436][15444] Updated weights for policy 0, policy_version 36331 (0.0010) [2024-08-05 18:05:38,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 297639936. Throughput: 0: 6053.8. Samples: 74405250. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 18:05:38,119][15372] Avg episode reward: [(0, '42.550')] [2024-08-05 18:05:40,433][15444] Updated weights for policy 0, policy_version 36341 (0.0015) [2024-08-05 18:05:43,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 297762816. Throughput: 0: 6042.7. Samples: 74440970. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 18:05:43,126][15372] Avg episode reward: [(0, '41.432')] [2024-08-05 18:05:44,290][15444] Updated weights for policy 0, policy_version 36351 (0.0033) [2024-08-05 18:05:47,656][15444] Updated weights for policy 0, policy_version 36361 (0.0028) [2024-08-05 18:05:48,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 297877504. Throughput: 0: 6026.7. Samples: 74476490. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 18:05:48,119][15372] Avg episode reward: [(0, '41.191')] [2024-08-05 18:05:50,877][15444] Updated weights for policy 0, policy_version 36371 (0.0013) [2024-08-05 18:05:53,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24187.2). 
Total num frames: 298000384. Throughput: 0: 6012.0. Samples: 74494500. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 18:05:53,126][15372] Avg episode reward: [(0, '41.930')] [2024-08-05 18:05:54,680][15444] Updated weights for policy 0, policy_version 36381 (0.0014) [2024-08-05 18:05:54,796][15417] Signal inference workers to stop experience collection... (13500 times) [2024-08-05 18:05:54,797][15417] Signal inference workers to resume experience collection... (13500 times) [2024-08-05 18:05:54,824][15444] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-08-05 18:05:54,852][15444] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-08-05 18:05:57,581][15444] Updated weights for policy 0, policy_version 36391 (0.0050) [2024-08-05 18:05:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.4, 300 sec: 24187.2). Total num frames: 298115072. Throughput: 0: 6002.0. Samples: 74530400. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 18:05:58,119][15372] Avg episode reward: [(0, '41.958')] [2024-08-05 18:06:01,356][15444] Updated weights for policy 0, policy_version 36401 (0.0029) [2024-08-05 18:06:03,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 298237952. Throughput: 0: 5986.9. Samples: 74565650. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 18:06:03,119][15372] Avg episode reward: [(0, '41.787')] [2024-08-05 18:06:04,692][15444] Updated weights for policy 0, policy_version 36411 (0.0027) [2024-08-05 18:06:08,037][15444] Updated weights for policy 0, policy_version 36421 (0.0028) [2024-08-05 18:06:08,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 298360832. Throughput: 0: 5985.0. Samples: 74584190. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 18:06:08,119][15372] Avg episode reward: [(0, '40.934')] [2024-08-05 18:06:11,634][15444] Updated weights for policy 0, policy_version 36431 (0.0028) [2024-08-05 18:06:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.5, 300 sec: 24159.5). Total num frames: 298475520. Throughput: 0: 5971.6. Samples: 74620020. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 18:06:13,119][15372] Avg episode reward: [(0, '41.368')] [2024-08-05 18:06:14,597][15444] Updated weights for policy 0, policy_version 36441 (0.0016) [2024-08-05 18:06:18,119][15372] Fps is (10 sec: 23757.5, 60 sec: 24029.9, 300 sec: 24187.3). Total num frames: 298598400. Throughput: 0: 5989.1. Samples: 74656440. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 18:06:18,126][15372] Avg episode reward: [(0, '41.561')] [2024-08-05 18:06:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000036450_298598400.pth... [2024-08-05 18:06:18,278][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000035742_292798464.pth [2024-08-05 18:06:18,374][15444] Updated weights for policy 0, policy_version 36451 (0.0013) [2024-08-05 18:06:21,776][15444] Updated weights for policy 0, policy_version 36461 (0.0021) [2024-08-05 18:06:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 23893.3, 300 sec: 24187.2). Total num frames: 298721280. Throughput: 0: 5979.6. Samples: 74674330. Policy #0 lag: (min: 0.0, avg: 3.2, max: 8.0) [2024-08-05 18:06:23,119][15372] Avg episode reward: [(0, '41.542')] [2024-08-05 18:06:25,062][15444] Updated weights for policy 0, policy_version 36471 (0.0023) [2024-08-05 18:06:28,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24029.8, 300 sec: 24187.3). Total num frames: 298844160. Throughput: 0: 6007.8. Samples: 74711320. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:06:28,126][15372] Avg episode reward: [(0, '42.248')] [2024-08-05 18:06:28,294][15444] Updated weights for policy 0, policy_version 36481 (0.0011) [2024-08-05 18:06:28,990][15417] Signal inference workers to stop experience collection... (13550 times) [2024-08-05 18:06:28,990][15417] Signal inference workers to resume experience collection... (13550 times) [2024-08-05 18:06:29,056][15444] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-08-05 18:06:29,056][15444] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-08-05 18:06:31,641][15444] Updated weights for policy 0, policy_version 36491 (0.0027) [2024-08-05 18:06:33,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 298967040. Throughput: 0: 6030.9. Samples: 74747880. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:06:33,119][15372] Avg episode reward: [(0, '41.759')] [2024-08-05 18:06:35,125][15444] Updated weights for policy 0, policy_version 36501 (0.0022) [2024-08-05 18:06:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 299089920. Throughput: 0: 6027.3. Samples: 74765730. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:06:38,126][15372] Avg episode reward: [(0, '41.829')] [2024-08-05 18:06:38,610][15444] Updated weights for policy 0, policy_version 36511 (0.0029) [2024-08-05 18:06:42,063][15444] Updated weights for policy 0, policy_version 36521 (0.0018) [2024-08-05 18:06:43,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 299204608. Throughput: 0: 6032.7. Samples: 74801870. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:06:43,119][15372] Avg episode reward: [(0, '43.187')] [2024-08-05 18:06:43,126][15417] Saving new best policy, reward=43.187! 
[2024-08-05 18:06:45,242][15444] Updated weights for policy 0, policy_version 36531 (0.0011) [2024-08-05 18:06:48,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 299327488. Throughput: 0: 6058.7. Samples: 74838290. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:06:48,126][15372] Avg episode reward: [(0, '41.503')] [2024-08-05 18:06:48,576][15444] Updated weights for policy 0, policy_version 36541 (0.0017) [2024-08-05 18:06:52,143][15444] Updated weights for policy 0, policy_version 36551 (0.0019) [2024-08-05 18:06:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 299450368. Throughput: 0: 6042.3. Samples: 74856090. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:06:53,119][15372] Avg episode reward: [(0, '41.317')] [2024-08-05 18:06:55,574][15444] Updated weights for policy 0, policy_version 36561 (0.0013) [2024-08-05 18:06:58,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 299565056. Throughput: 0: 6061.8. Samples: 74892800. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:06:58,119][15372] Avg episode reward: [(0, '41.564')] [2024-08-05 18:06:58,779][15444] Updated weights for policy 0, policy_version 36571 (0.0017) [2024-08-05 18:07:02,283][15444] Updated weights for policy 0, policy_version 36581 (0.0017) [2024-08-05 18:07:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 299687936. Throughput: 0: 6048.0. Samples: 74928600. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:07:03,119][15372] Avg episode reward: [(0, '42.273')] [2024-08-05 18:07:05,869][15444] Updated weights for policy 0, policy_version 36591 (0.0011) [2024-08-05 18:07:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 299810816. Throughput: 0: 6069.8. Samples: 74947470. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:07:08,119][15372] Avg episode reward: [(0, '41.709')] [2024-08-05 18:07:08,951][15444] Updated weights for policy 0, policy_version 36601 (0.0022) [2024-08-05 18:07:12,383][15444] Updated weights for policy 0, policy_version 36611 (0.0030) [2024-08-05 18:07:13,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 299933696. Throughput: 0: 6055.3. Samples: 74983810. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:07:13,126][15372] Avg episode reward: [(0, '42.031')] [2024-08-05 18:07:15,915][15444] Updated weights for policy 0, policy_version 36621 (0.0024) [2024-08-05 18:07:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 300048384. Throughput: 0: 6030.9. Samples: 75019270. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:07:18,129][15372] Avg episode reward: [(0, '41.628')] [2024-08-05 18:07:18,477][15417] Signal inference workers to stop experience collection... (13600 times) [2024-08-05 18:07:18,477][15417] Signal inference workers to resume experience collection... (13600 times) [2024-08-05 18:07:18,527][15444] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-08-05 18:07:18,527][15444] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-08-05 18:07:19,287][15444] Updated weights for policy 0, policy_version 36631 (0.0019) [2024-08-05 18:07:22,614][15444] Updated weights for policy 0, policy_version 36641 (0.0018) [2024-08-05 18:07:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 300171264. Throughput: 0: 6027.8. Samples: 75036980. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:07:23,119][15372] Avg episode reward: [(0, '41.469')] [2024-08-05 18:07:25,962][15444] Updated weights for policy 0, policy_version 36651 (0.0038) [2024-08-05 18:07:28,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 300294144. Throughput: 0: 6023.5. Samples: 75072930. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:07:28,126][15372] Avg episode reward: [(0, '41.254')] [2024-08-05 18:07:29,558][15444] Updated weights for policy 0, policy_version 36661 (0.0013) [2024-08-05 18:07:32,859][15444] Updated weights for policy 0, policy_version 36671 (0.0013) [2024-08-05 18:07:33,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 300408832. Throughput: 0: 6008.0. Samples: 75108650. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:07:33,119][15372] Avg episode reward: [(0, '40.941')] [2024-08-05 18:07:36,128][15444] Updated weights for policy 0, policy_version 36681 (0.0019) [2024-08-05 18:07:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 300531712. Throughput: 0: 6030.2. Samples: 75127450. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:07:38,126][15372] Avg episode reward: [(0, '42.388')] [2024-08-05 18:07:39,799][15444] Updated weights for policy 0, policy_version 36691 (0.0024) [2024-08-05 18:07:43,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24029.7, 300 sec: 24131.7). Total num frames: 300646400. Throughput: 0: 6028.0. Samples: 75164060. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:07:43,126][15372] Avg episode reward: [(0, '42.063')] [2024-08-05 18:07:43,156][15444] Updated weights for policy 0, policy_version 36701 (0.0022) [2024-08-05 18:07:46,414][15444] Updated weights for policy 0, policy_version 36711 (0.0016) [2024-08-05 18:07:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24159.5). 
Total num frames: 300769280. Throughput: 0: 6017.1. Samples: 75199370. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:07:48,136][15372] Avg episode reward: [(0, '40.907')] [2024-08-05 18:07:49,795][15444] Updated weights for policy 0, policy_version 36721 (0.0021) [2024-08-05 18:07:53,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 300892160. Throughput: 0: 6009.1. Samples: 75217880. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:07:53,126][15372] Avg episode reward: [(0, '40.861')] [2024-08-05 18:07:53,312][15444] Updated weights for policy 0, policy_version 36731 (0.0016) [2024-08-05 18:07:56,787][15444] Updated weights for policy 0, policy_version 36741 (0.0012) [2024-08-05 18:07:58,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 301015040. Throughput: 0: 6006.8. Samples: 75254120. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:07:58,119][15372] Avg episode reward: [(0, '41.302')] [2024-08-05 18:07:59,945][15444] Updated weights for policy 0, policy_version 36751 (0.0012) [2024-08-05 18:08:03,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 301129728. Throughput: 0: 6017.1. Samples: 75290040. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:08:03,126][15372] Avg episode reward: [(0, '41.229')] [2024-08-05 18:08:03,500][15444] Updated weights for policy 0, policy_version 36761 (0.0033) [2024-08-05 18:08:05,965][15417] Signal inference workers to stop experience collection... (13650 times) [2024-08-05 18:08:05,966][15417] Signal inference workers to resume experience collection... 
(13650 times) [2024-08-05 18:08:06,044][15444] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-08-05 18:08:06,051][15444] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-08-05 18:08:07,039][15444] Updated weights for policy 0, policy_version 36771 (0.0029) [2024-08-05 18:08:08,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 301252608. Throughput: 0: 6026.2. Samples: 75308160. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:08:08,119][15372] Avg episode reward: [(0, '41.487')] [2024-08-05 18:08:10,379][15444] Updated weights for policy 0, policy_version 36781 (0.0014) [2024-08-05 18:08:13,122][15372] Fps is (10 sec: 23749.5, 60 sec: 23892.0, 300 sec: 24132.1). Total num frames: 301367296. Throughput: 0: 6021.1. Samples: 75343900. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:08:13,122][15372] Avg episode reward: [(0, '40.980')] [2024-08-05 18:08:14,015][15444] Updated weights for policy 0, policy_version 36791 (0.0013) [2024-08-05 18:08:17,281][15444] Updated weights for policy 0, policy_version 36801 (0.0019) [2024-08-05 18:08:18,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 301490176. Throughput: 0: 6022.2. Samples: 75379650. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:08:18,126][15372] Avg episode reward: [(0, '41.384')] [2024-08-05 18:08:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000036803_301490176.pth... [2024-08-05 18:08:18,268][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000036097_295706624.pth [2024-08-05 18:08:20,541][15444] Updated weights for policy 0, policy_version 36811 (0.0014) [2024-08-05 18:08:23,118][15372] Fps is (10 sec: 25403.8, 60 sec: 24166.4, 300 sec: 24159.5). 
Total num frames: 301621248. Throughput: 0: 6000.0. Samples: 75397450. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:08:23,119][15372] Avg episode reward: [(0, '42.431')] [2024-08-05 18:08:24,275][15444] Updated weights for policy 0, policy_version 36821 (0.0032) [2024-08-05 18:08:27,221][15444] Updated weights for policy 0, policy_version 36831 (0.0014) [2024-08-05 18:08:28,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23893.4, 300 sec: 24103.9). Total num frames: 301727744. Throughput: 0: 5999.2. Samples: 75434020. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:08:28,126][15372] Avg episode reward: [(0, '41.573')] [2024-08-05 18:08:30,849][15444] Updated weights for policy 0, policy_version 36841 (0.0011) [2024-08-05 18:08:33,119][15372] Fps is (10 sec: 22937.1, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 301850624. Throughput: 0: 6028.9. Samples: 75470670. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:08:33,120][15372] Avg episode reward: [(0, '41.555')] [2024-08-05 18:08:34,356][15444] Updated weights for policy 0, policy_version 36851 (0.0022) [2024-08-05 18:08:37,517][15444] Updated weights for policy 0, policy_version 36861 (0.0013) [2024-08-05 18:08:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 301973504. Throughput: 0: 5998.4. Samples: 75487810. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:08:38,119][15372] Avg episode reward: [(0, '40.977')] [2024-08-05 18:08:39,392][15417] Signal inference workers to stop experience collection... (13700 times) [2024-08-05 18:08:39,392][15417] Signal inference workers to resume experience collection... 
(13700 times) [2024-08-05 18:08:39,410][15444] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-08-05 18:08:39,438][15444] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-08-05 18:08:40,960][15444] Updated weights for policy 0, policy_version 36871 (0.0012) [2024-08-05 18:08:43,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 302096384. Throughput: 0: 6014.2. Samples: 75524760. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:08:43,119][15372] Avg episode reward: [(0, '40.890')] [2024-08-05 18:08:44,208][15444] Updated weights for policy 0, policy_version 36881 (0.0013) [2024-08-05 18:08:47,677][15444] Updated weights for policy 0, policy_version 36891 (0.0023) [2024-08-05 18:08:48,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 302219264. Throughput: 0: 6015.4. Samples: 75560730. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:08:48,119][15372] Avg episode reward: [(0, '42.444')] [2024-08-05 18:08:51,095][15444] Updated weights for policy 0, policy_version 36901 (0.0024) [2024-08-05 18:08:53,118][15372] Fps is (10 sec: 23758.3, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 302333952. Throughput: 0: 6020.2. Samples: 75579070. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:08:53,126][15372] Avg episode reward: [(0, '42.279')] [2024-08-05 18:08:54,525][15444] Updated weights for policy 0, policy_version 36911 (0.0023) [2024-08-05 18:08:57,730][15444] Updated weights for policy 0, policy_version 36921 (0.0014) [2024-08-05 18:08:58,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 302456832. Throughput: 0: 6034.5. Samples: 75615430. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:08:58,119][15372] Avg episode reward: [(0, '42.801')] [2024-08-05 18:09:01,224][15444] Updated weights for policy 0, policy_version 36931 (0.0011) [2024-08-05 18:09:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 302579712. Throughput: 0: 6038.5. Samples: 75651380. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:09:03,126][15372] Avg episode reward: [(0, '42.989')] [2024-08-05 18:09:04,421][15444] Updated weights for policy 0, policy_version 36941 (0.0020) [2024-08-05 18:09:07,906][15444] Updated weights for policy 0, policy_version 36951 (0.0029) [2024-08-05 18:09:08,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 302710784. Throughput: 0: 6060.5. Samples: 75670170. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:09:08,119][15372] Avg episode reward: [(0, '42.356')] [2024-08-05 18:09:11,778][15444] Updated weights for policy 0, policy_version 36961 (0.0013) [2024-08-05 18:09:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24167.8, 300 sec: 24103.9). Total num frames: 302817280. Throughput: 0: 6040.0. Samples: 75705820. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 18:09:13,119][15372] Avg episode reward: [(0, '41.815')] [2024-08-05 18:09:14,710][15444] Updated weights for policy 0, policy_version 36971 (0.0014) [2024-08-05 18:09:18,118][15372] Fps is (10 sec: 22118.3, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 302931968. Throughput: 0: 6017.1. Samples: 75741440. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 18:09:18,126][15372] Avg episode reward: [(0, '41.603')] [2024-08-05 18:09:18,459][15444] Updated weights for policy 0, policy_version 36981 (0.0013) [2024-08-05 18:09:21,799][15444] Updated weights for policy 0, policy_version 36991 (0.0013) [2024-08-05 18:09:23,114][15417] Signal inference workers to stop experience collection... 
(13750 times) [2024-08-05 18:09:23,114][15417] Signal inference workers to resume experience collection... (13750 times) [2024-08-05 18:09:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 303063040. Throughput: 0: 6042.7. Samples: 75759730. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 18:09:23,119][15372] Avg episode reward: [(0, '41.100')] [2024-08-05 18:09:23,163][15444] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-08-05 18:09:23,163][15444] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-08-05 18:09:25,005][15444] Updated weights for policy 0, policy_version 37001 (0.0013) [2024-08-05 18:09:28,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 303185920. Throughput: 0: 6041.9. Samples: 75796640. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 18:09:28,126][15372] Avg episode reward: [(0, '40.823')] [2024-08-05 18:09:28,739][15444] Updated weights for policy 0, policy_version 37011 (0.0020) [2024-08-05 18:09:31,695][15444] Updated weights for policy 0, policy_version 37021 (0.0042) [2024-08-05 18:09:33,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24030.0, 300 sec: 24076.1). Total num frames: 303292416. Throughput: 0: 6025.1. Samples: 75831860. Policy #0 lag: (min: 0.0, avg: 2.8, max: 7.0) [2024-08-05 18:09:33,126][15372] Avg episode reward: [(0, '41.797')] [2024-08-05 18:09:35,406][15444] Updated weights for policy 0, policy_version 37031 (0.0014) [2024-08-05 18:09:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24131.7). Total num frames: 303431680. Throughput: 0: 6028.2. Samples: 75850340. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:09:38,119][15372] Avg episode reward: [(0, '41.625')] [2024-08-05 18:09:38,382][15444] Updated weights for policy 0, policy_version 37041 (0.0017) [2024-08-05 18:09:42,042][15444] Updated weights for policy 0, policy_version 37051 (0.0039) [2024-08-05 18:09:43,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24166.6, 300 sec: 24131.7). Total num frames: 303546368. Throughput: 0: 6027.8. Samples: 75886680. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:09:43,119][15372] Avg episode reward: [(0, '41.530')] [2024-08-05 18:09:45,509][15444] Updated weights for policy 0, policy_version 37061 (0.0012) [2024-08-05 18:09:48,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 303661056. Throughput: 0: 6031.3. Samples: 75922790. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:09:48,126][15372] Avg episode reward: [(0, '41.562')] [2024-08-05 18:09:48,859][15444] Updated weights for policy 0, policy_version 37071 (0.0012) [2024-08-05 18:09:52,412][15444] Updated weights for policy 0, policy_version 37081 (0.0021) [2024-08-05 18:09:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24076.2). Total num frames: 303783936. Throughput: 0: 6012.0. Samples: 75940710. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:09:53,119][15372] Avg episode reward: [(0, '41.074')] [2024-08-05 18:09:55,469][15444] Updated weights for policy 0, policy_version 37091 (0.0013) [2024-08-05 18:09:58,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 303906816. Throughput: 0: 6014.4. Samples: 75976470. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:09:58,126][15372] Avg episode reward: [(0, '40.830')] [2024-08-05 18:09:59,133][15444] Updated weights for policy 0, policy_version 37101 (0.0018) [2024-08-05 18:10:01,034][15417] Signal inference workers to stop experience collection... 
(13800 times) [2024-08-05 18:10:01,035][15417] Signal inference workers to resume experience collection... (13800 times) [2024-08-05 18:10:01,090][15444] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-08-05 18:10:01,090][15444] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-08-05 18:10:02,710][15444] Updated weights for policy 0, policy_version 37111 (0.0021) [2024-08-05 18:10:03,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 304021504. Throughput: 0: 6036.0. Samples: 76013060. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 18:10:03,119][15372] Avg episode reward: [(0, '41.144')] [2024-08-05 18:10:05,705][15444] Updated weights for policy 0, policy_version 37121 (0.0016) [2024-08-05 18:10:08,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 304152576. Throughput: 0: 6029.5. Samples: 76031060. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 18:10:08,126][15372] Avg episode reward: [(0, '40.971')] [2024-08-05 18:10:09,185][15444] Updated weights for policy 0, policy_version 37131 (0.0018) [2024-08-05 18:10:12,767][15444] Updated weights for policy 0, policy_version 37141 (0.0015) [2024-08-05 18:10:13,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 304267264. Throughput: 0: 6008.0. Samples: 76067000. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 18:10:13,119][15372] Avg episode reward: [(0, '41.520')] [2024-08-05 18:10:15,970][15444] Updated weights for policy 0, policy_version 37151 (0.0029) [2024-08-05 18:10:18,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24302.9, 300 sec: 24076.1). Total num frames: 304390144. Throughput: 0: 6026.2. Samples: 76103040. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 18:10:18,127][15372] Avg episode reward: [(0, '41.782')] [2024-08-05 18:10:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000037157_304390144.pth... [2024-08-05 18:10:18,272][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000036450_298598400.pth [2024-08-05 18:10:19,631][15444] Updated weights for policy 0, policy_version 37161 (0.0018) [2024-08-05 18:10:23,003][15444] Updated weights for policy 0, policy_version 37171 (0.0025) [2024-08-05 18:10:23,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24029.8, 300 sec: 24076.1). Total num frames: 304504832. Throughput: 0: 6010.0. Samples: 76120790. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:10:23,119][15372] Avg episode reward: [(0, '40.825')] [2024-08-05 18:10:26,247][15444] Updated weights for policy 0, policy_version 37181 (0.0023) [2024-08-05 18:10:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 304627712. Throughput: 0: 6000.2. Samples: 76156690. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:10:28,126][15372] Avg episode reward: [(0, '41.152')] [2024-08-05 18:10:29,691][15444] Updated weights for policy 0, policy_version 37191 (0.0021) [2024-08-05 18:10:33,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24166.4, 300 sec: 24076.2). Total num frames: 304742400. Throughput: 0: 6005.6. Samples: 76193040. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:10:33,126][15372] Avg episode reward: [(0, '42.192')] [2024-08-05 18:10:33,175][15444] Updated weights for policy 0, policy_version 37201 (0.0021) [2024-08-05 18:10:36,371][15444] Updated weights for policy 0, policy_version 37211 (0.0012) [2024-08-05 18:10:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 304873472. 
Throughput: 0: 6030.9. Samples: 76212100. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:10:38,126][15372] Avg episode reward: [(0, '42.094')] [2024-08-05 18:10:39,724][15444] Updated weights for policy 0, policy_version 37221 (0.0018) [2024-08-05 18:10:43,111][15444] Updated weights for policy 0, policy_version 37231 (0.0025) [2024-08-05 18:10:43,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 304996352. Throughput: 0: 6057.8. Samples: 76249070. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:10:43,119][15372] Avg episode reward: [(0, '42.159')] [2024-08-05 18:10:46,355][15444] Updated weights for policy 0, policy_version 37241 (0.0011) [2024-08-05 18:10:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 305111040. Throughput: 0: 6038.5. Samples: 76284790. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:10:48,126][15372] Avg episode reward: [(0, '41.695')] [2024-08-05 18:10:50,002][15444] Updated weights for policy 0, policy_version 37251 (0.0015) [2024-08-05 18:10:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 305233920. Throughput: 0: 6044.7. Samples: 76303070. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:10:53,119][15372] Avg episode reward: [(0, '41.003')] [2024-08-05 18:10:53,271][15444] Updated weights for policy 0, policy_version 37261 (0.0011) [2024-08-05 18:10:56,723][15444] Updated weights for policy 0, policy_version 37271 (0.0017) [2024-08-05 18:10:58,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 305356800. Throughput: 0: 6052.7. Samples: 76339370. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:10:58,119][15372] Avg episode reward: [(0, '40.875')] [2024-08-05 18:10:59,875][15444] Updated weights for policy 0, policy_version 37281 (0.0013) [2024-08-05 18:11:00,562][15417] Signal inference workers to stop experience collection... (13850 times) [2024-08-05 18:11:00,563][15417] Signal inference workers to resume experience collection... (13850 times) [2024-08-05 18:11:00,609][15444] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-08-05 18:11:00,609][15444] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-08-05 18:11:03,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 305479680. Throughput: 0: 6074.0. Samples: 76376370. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:11:03,119][15372] Avg episode reward: [(0, '41.943')] [2024-08-05 18:11:03,307][15444] Updated weights for policy 0, policy_version 37291 (0.0011) [2024-08-05 18:11:06,870][15444] Updated weights for policy 0, policy_version 37301 (0.0036) [2024-08-05 18:11:08,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 305602560. Throughput: 0: 6075.4. Samples: 76394180. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:11:08,119][15372] Avg episode reward: [(0, '42.346')] [2024-08-05 18:11:10,085][15444] Updated weights for policy 0, policy_version 37311 (0.0025) [2024-08-05 18:11:13,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 305725440. Throughput: 0: 6101.3. Samples: 76431250. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:11:13,126][15372] Avg episode reward: [(0, '40.913')] [2024-08-05 18:11:13,303][15444] Updated weights for policy 0, policy_version 37321 (0.0018) [2024-08-05 18:11:16,707][15444] Updated weights for policy 0, policy_version 37331 (0.0019) [2024-08-05 18:11:18,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24302.8, 300 sec: 24159.4). Total num frames: 305848320. Throughput: 0: 6091.3. Samples: 76467150. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:11:18,119][15372] Avg episode reward: [(0, '40.255')] [2024-08-05 18:11:19,979][15444] Updated weights for policy 0, policy_version 37341 (0.0012) [2024-08-05 18:11:23,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24439.5, 300 sec: 24159.4). Total num frames: 305971200. Throughput: 0: 6093.3. Samples: 76486300. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:11:23,126][15372] Avg episode reward: [(0, '40.822')] [2024-08-05 18:11:23,544][15444] Updated weights for policy 0, policy_version 37351 (0.0018) [2024-08-05 18:11:26,849][15444] Updated weights for policy 0, policy_version 37361 (0.0012) [2024-08-05 18:11:28,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.8, 300 sec: 24131.7). Total num frames: 306085888. Throughput: 0: 6067.5. Samples: 76522110. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:11:28,119][15372] Avg episode reward: [(0, '41.820')] [2024-08-05 18:11:30,171][15444] Updated weights for policy 0, policy_version 37371 (0.0011) [2024-08-05 18:11:33,119][15372] Fps is (10 sec: 22937.1, 60 sec: 24302.8, 300 sec: 24103.9). Total num frames: 306200576. Throughput: 0: 6076.2. Samples: 76558220. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:11:33,127][15372] Avg episode reward: [(0, '41.323')] [2024-08-05 18:11:33,688][15444] Updated weights for policy 0, policy_version 37381 (0.0012) [2024-08-05 18:11:37,345][15444] Updated weights for policy 0, policy_version 37391 (0.0022) [2024-08-05 18:11:38,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 306331648. Throughput: 0: 6072.0. Samples: 76576310. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:11:38,119][15372] Avg episode reward: [(0, '42.034')] [2024-08-05 18:11:40,253][15444] Updated weights for policy 0, policy_version 37401 (0.0014) [2024-08-05 18:11:43,133][15372] Fps is (10 sec: 25358.0, 60 sec: 24296.8, 300 sec: 24158.2). Total num frames: 306454528. Throughput: 0: 6074.0. Samples: 76612790. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:11:43,134][15372] Avg episode reward: [(0, '42.211')] [2024-08-05 18:11:44,036][15444] Updated weights for policy 0, policy_version 37411 (0.0025) [2024-08-05 18:11:47,144][15444] Updated weights for policy 0, policy_version 37421 (0.0010) [2024-08-05 18:11:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 306569216. Throughput: 0: 6046.9. Samples: 76648480. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:11:48,126][15372] Avg episode reward: [(0, '42.556')] [2024-08-05 18:11:50,695][15444] Updated weights for policy 0, policy_version 37431 (0.0012) [2024-08-05 18:11:51,078][15417] Signal inference workers to stop experience collection... (13900 times) [2024-08-05 18:11:51,078][15417] Signal inference workers to resume experience collection... 
(13900 times) [2024-08-05 18:11:51,148][15444] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-08-05 18:11:51,148][15444] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-08-05 18:11:53,119][15372] Fps is (10 sec: 24612.6, 60 sec: 24439.4, 300 sec: 24187.2). Total num frames: 306700288. Throughput: 0: 6068.0. Samples: 76667240. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:11:53,119][15372] Avg episode reward: [(0, '41.980')] [2024-08-05 18:11:54,089][15444] Updated weights for policy 0, policy_version 37441 (0.0019) [2024-08-05 18:11:57,303][15444] Updated weights for policy 0, policy_version 37451 (0.0019) [2024-08-05 18:11:58,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 306814976. Throughput: 0: 6059.3. Samples: 76703920. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:11:58,119][15372] Avg episode reward: [(0, '42.242')] [2024-08-05 18:12:00,811][15444] Updated weights for policy 0, policy_version 37461 (0.0015) [2024-08-05 18:12:03,119][15372] Fps is (10 sec: 22936.7, 60 sec: 24166.3, 300 sec: 24131.6). Total num frames: 306929664. Throughput: 0: 6064.0. Samples: 76740030. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:12:03,127][15372] Avg episode reward: [(0, '41.167')] [2024-08-05 18:12:04,105][15444] Updated weights for policy 0, policy_version 37471 (0.0016) [2024-08-05 18:12:07,649][15444] Updated weights for policy 0, policy_version 37481 (0.0041) [2024-08-05 18:12:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 307052544. Throughput: 0: 6031.4. Samples: 76757710. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:12:08,119][15372] Avg episode reward: [(0, '40.748')] [2024-08-05 18:12:11,157][15444] Updated weights for policy 0, policy_version 37491 (0.0030) [2024-08-05 18:12:13,119][15372] Fps is (10 sec: 24577.2, 60 sec: 24166.4, 300 sec: 24159.4). 
Total num frames: 307175424. Throughput: 0: 6032.7. Samples: 76793580. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:12:13,126][15372] Avg episode reward: [(0, '41.021')] [2024-08-05 18:12:14,274][15444] Updated weights for policy 0, policy_version 37501 (0.0022) [2024-08-05 18:12:18,029][15444] Updated weights for policy 0, policy_version 37511 (0.0012) [2024-08-05 18:12:18,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 307290112. Throughput: 0: 6048.5. Samples: 76830400. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:12:18,119][15372] Avg episode reward: [(0, '41.814')] [2024-08-05 18:12:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000037511_307290112.pth... [2024-08-05 18:12:18,267][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000036803_301490176.pth [2024-08-05 18:12:20,955][15444] Updated weights for policy 0, policy_version 37521 (0.0022) [2024-08-05 18:12:23,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 307421184. Throughput: 0: 6039.3. Samples: 76848080. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:12:23,126][15372] Avg episode reward: [(0, '41.580')] [2024-08-05 18:12:24,791][15444] Updated weights for policy 0, policy_version 37531 (0.0012) [2024-08-05 18:12:28,062][15444] Updated weights for policy 0, policy_version 37541 (0.0013) [2024-08-05 18:12:28,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 307535872. Throughput: 0: 6024.6. Samples: 76883810. 
Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:12:28,119][15372] Avg episode reward: [(0, '42.022')] [2024-08-05 18:12:31,398][15444] Updated weights for policy 0, policy_version 37551 (0.0017) [2024-08-05 18:12:33,118][15372] Fps is (10 sec: 22937.9, 60 sec: 24166.6, 300 sec: 24131.7). Total num frames: 307650560. Throughput: 0: 6031.6. Samples: 76919900. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:12:33,126][15372] Avg episode reward: [(0, '42.089')] [2024-08-05 18:12:33,599][15417] Signal inference workers to stop experience collection... (13950 times) [2024-08-05 18:12:33,599][15417] Signal inference workers to resume experience collection... (13950 times) [2024-08-05 18:12:33,648][15444] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-08-05 18:12:33,654][15444] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-08-05 18:12:35,015][15444] Updated weights for policy 0, policy_version 37561 (0.0016) [2024-08-05 18:12:38,118][15372] Fps is (10 sec: 23757.5, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 307773440. Throughput: 0: 6012.7. Samples: 76937810. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:12:38,126][15372] Avg episode reward: [(0, '42.642')] [2024-08-05 18:12:38,134][15444] Updated weights for policy 0, policy_version 37571 (0.0020) [2024-08-05 18:12:41,684][15444] Updated weights for policy 0, policy_version 37581 (0.0019) [2024-08-05 18:12:43,119][15372] Fps is (10 sec: 23756.3, 60 sec: 23899.2, 300 sec: 24131.7). Total num frames: 307888128. Throughput: 0: 5996.9. Samples: 76973780. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:12:43,119][15372] Avg episode reward: [(0, '41.918')] [2024-08-05 18:12:45,226][15444] Updated weights for policy 0, policy_version 37591 (0.0018) [2024-08-05 18:12:48,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 308019200. Throughput: 0: 5998.3. Samples: 77009950. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 18:12:48,119][15372] Avg episode reward: [(0, '41.996')] [2024-08-05 18:12:48,436][15444] Updated weights for policy 0, policy_version 37601 (0.0015) [2024-08-05 18:12:52,081][15444] Updated weights for policy 0, policy_version 37611 (0.0010) [2024-08-05 18:12:53,118][15372] Fps is (10 sec: 24576.5, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 308133888. Throughput: 0: 6012.0. Samples: 77028250. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 18:12:53,119][15372] Avg episode reward: [(0, '41.243')] [2024-08-05 18:12:55,213][15444] Updated weights for policy 0, policy_version 37621 (0.0036) [2024-08-05 18:12:58,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 308256768. Throughput: 0: 6015.8. Samples: 77064290. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 18:12:58,126][15372] Avg episode reward: [(0, '40.877')] [2024-08-05 18:12:58,727][15444] Updated weights for policy 0, policy_version 37631 (0.0021) [2024-08-05 18:13:02,201][15444] Updated weights for policy 0, policy_version 37641 (0.0023) [2024-08-05 18:13:03,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.6, 300 sec: 24159.4). Total num frames: 308379648. Throughput: 0: 6003.1. Samples: 77100540. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 18:13:03,119][15372] Avg episode reward: [(0, '42.010')] [2024-08-05 18:13:05,381][15444] Updated weights for policy 0, policy_version 37651 (0.0013) [2024-08-05 18:13:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.5). Total num frames: 308502528. Throughput: 0: 6010.0. Samples: 77118530. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 18:13:08,119][15372] Avg episode reward: [(0, '42.326')] [2024-08-05 18:13:08,984][15444] Updated weights for policy 0, policy_version 37661 (0.0017) [2024-08-05 18:13:12,034][15444] Updated weights for policy 0, policy_version 37671 (0.0023) [2024-08-05 18:13:13,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 308617216. Throughput: 0: 6024.9. Samples: 77154930. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 18:13:13,127][15372] Avg episode reward: [(0, '42.018')] [2024-08-05 18:13:15,721][15444] Updated weights for policy 0, policy_version 37681 (0.0022) [2024-08-05 18:13:17,934][15417] Signal inference workers to stop experience collection... (14000 times) [2024-08-05 18:13:17,949][15417] Signal inference workers to resume experience collection... (14000 times) [2024-08-05 18:13:18,011][15444] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-08-05 18:13:18,017][15444] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-08-05 18:13:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 308740096. Throughput: 0: 6038.4. Samples: 77191630. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 18:13:18,119][15372] Avg episode reward: [(0, '41.010')] [2024-08-05 18:13:19,105][15444] Updated weights for policy 0, policy_version 37691 (0.0012) [2024-08-05 18:13:22,355][15444] Updated weights for policy 0, policy_version 37701 (0.0030) [2024-08-05 18:13:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 308862976. Throughput: 0: 6023.6. Samples: 77208870. 
Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 18:13:23,119][15372] Avg episode reward: [(0, '42.467')] [2024-08-05 18:13:26,041][15444] Updated weights for policy 0, policy_version 37711 (0.0013) [2024-08-05 18:13:28,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23893.5, 300 sec: 24131.7). Total num frames: 308969472. Throughput: 0: 6020.7. Samples: 77244710. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 18:13:28,119][15372] Avg episode reward: [(0, '42.082')] [2024-08-05 18:13:29,261][15444] Updated weights for policy 0, policy_version 37721 (0.0020) [2024-08-05 18:13:32,952][15444] Updated weights for policy 0, policy_version 37731 (0.0033) [2024-08-05 18:13:33,119][15372] Fps is (10 sec: 22937.3, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 309092352. Throughput: 0: 6006.9. Samples: 77280260. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 18:13:33,119][15372] Avg episode reward: [(0, '41.361')] [2024-08-05 18:13:36,518][15444] Updated weights for policy 0, policy_version 37741 (0.0024) [2024-08-05 18:13:38,121][15372] Fps is (10 sec: 25388.5, 60 sec: 24165.4, 300 sec: 24159.3). Total num frames: 309223424. Throughput: 0: 5997.9. Samples: 77298170. Policy #0 lag: (min: 1.0, avg: 3.1, max: 7.0) [2024-08-05 18:13:38,121][15372] Avg episode reward: [(0, '41.819')] [2024-08-05 18:13:39,643][15444] Updated weights for policy 0, policy_version 37751 (0.0013) [2024-08-05 18:13:43,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 309338112. Throughput: 0: 6012.7. Samples: 77334860. Policy #0 lag: (min: 1.0, avg: 3.1, max: 7.0) [2024-08-05 18:13:43,126][15372] Avg episode reward: [(0, '42.324')] [2024-08-05 18:13:43,127][15444] Updated weights for policy 0, policy_version 37761 (0.0024) [2024-08-05 18:13:46,314][15444] Updated weights for policy 0, policy_version 37771 (0.0022) [2024-08-05 18:13:48,118][15372] Fps is (10 sec: 23763.1, 60 sec: 24029.9, 300 sec: 24159.5). 
Total num frames: 309460992. Throughput: 0: 6001.1. Samples: 77370590. Policy #0 lag: (min: 1.0, avg: 3.1, max: 7.0) [2024-08-05 18:13:48,126][15372] Avg episode reward: [(0, '41.987')] [2024-08-05 18:13:49,790][15444] Updated weights for policy 0, policy_version 37781 (0.0036) [2024-08-05 18:13:53,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 309575680. Throughput: 0: 6010.9. Samples: 77389020. Policy #0 lag: (min: 1.0, avg: 3.1, max: 7.0) [2024-08-05 18:13:53,126][15372] Avg episode reward: [(0, '41.719')] [2024-08-05 18:13:53,289][15444] Updated weights for policy 0, policy_version 37791 (0.0028) [2024-08-05 18:13:56,339][15444] Updated weights for policy 0, policy_version 37801 (0.0030) [2024-08-05 18:13:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 309698560. Throughput: 0: 6008.7. Samples: 77425320. Policy #0 lag: (min: 1.0, avg: 3.1, max: 7.0) [2024-08-05 18:13:58,126][15372] Avg episode reward: [(0, '41.542')] [2024-08-05 18:13:58,166][15417] Signal inference workers to stop experience collection... (14050 times) [2024-08-05 18:13:58,171][15417] Signal inference workers to resume experience collection... (14050 times) [2024-08-05 18:13:58,223][15444] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-08-05 18:13:58,223][15444] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-08-05 18:14:00,071][15444] Updated weights for policy 0, policy_version 37811 (0.0015) [2024-08-05 18:14:03,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 309821440. Throughput: 0: 5994.6. Samples: 77461390. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:14:03,127][15372] Avg episode reward: [(0, '41.463')] [2024-08-05 18:14:03,261][15444] Updated weights for policy 0, policy_version 37821 (0.0014) [2024-08-05 18:14:06,726][15444] Updated weights for policy 0, policy_version 37831 (0.0038) [2024-08-05 18:14:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 309936128. Throughput: 0: 6012.4. Samples: 77479430. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:14:08,119][15372] Avg episode reward: [(0, '42.216')] [2024-08-05 18:14:10,223][15444] Updated weights for policy 0, policy_version 37841 (0.0011) [2024-08-05 18:14:13,121][15372] Fps is (10 sec: 23752.1, 60 sec: 24029.0, 300 sec: 24159.3). Total num frames: 310059008. Throughput: 0: 6027.5. Samples: 77515960. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:14:13,129][15372] Avg episode reward: [(0, '41.782')] [2024-08-05 18:14:13,467][15444] Updated weights for policy 0, policy_version 37851 (0.0022) [2024-08-05 18:14:16,957][15444] Updated weights for policy 0, policy_version 37861 (0.0030) [2024-08-05 18:14:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 310181888. Throughput: 0: 6038.0. Samples: 77551970. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:14:18,119][15372] Avg episode reward: [(0, '40.553')] [2024-08-05 18:14:18,144][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000037865_310190080.pth... [2024-08-05 18:14:18,257][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000037157_304390144.pth [2024-08-05 18:14:20,198][15444] Updated weights for policy 0, policy_version 37871 (0.0012) [2024-08-05 18:14:23,119][15372] Fps is (10 sec: 23761.9, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 310296576. 
Throughput: 0: 6048.8. Samples: 77570350. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:14:23,126][15372] Avg episode reward: [(0, '40.660')] [2024-08-05 18:14:23,708][15444] Updated weights for policy 0, policy_version 37881 (0.0019) [2024-08-05 18:14:27,167][15444] Updated weights for policy 0, policy_version 37891 (0.0036) [2024-08-05 18:14:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 310427648. Throughput: 0: 6031.5. Samples: 77606280. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:14:28,119][15372] Avg episode reward: [(0, '40.905')] [2024-08-05 18:14:30,366][15444] Updated weights for policy 0, policy_version 37901 (0.0021) [2024-08-05 18:14:33,119][15372] Fps is (10 sec: 25395.1, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 310550528. Throughput: 0: 6045.1. Samples: 77642620. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:14:33,119][15372] Avg episode reward: [(0, '41.545')] [2024-08-05 18:14:33,910][15444] Updated weights for policy 0, policy_version 37911 (0.0018) [2024-08-05 18:14:37,239][15444] Updated weights for policy 0, policy_version 37921 (0.0024) [2024-08-05 18:14:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.9, 300 sec: 24131.7). Total num frames: 310665216. Throughput: 0: 6041.8. Samples: 77660900. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:14:38,119][15372] Avg episode reward: [(0, '41.923')] [2024-08-05 18:14:40,491][15444] Updated weights for policy 0, policy_version 37931 (0.0022) [2024-08-05 18:14:43,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 310788096. Throughput: 0: 6025.3. Samples: 77696460. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 18:14:43,127][15372] Avg episode reward: [(0, '40.593')] [2024-08-05 18:14:44,114][15444] Updated weights for policy 0, policy_version 37941 (0.0017) [2024-08-05 18:14:47,828][15444] Updated weights for policy 0, policy_version 37951 (0.0035) [2024-08-05 18:14:47,896][15417] Signal inference workers to stop experience collection... (14100 times) [2024-08-05 18:14:47,897][15417] Signal inference workers to resume experience collection... (14100 times) [2024-08-05 18:14:47,928][15444] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-08-05 18:14:47,928][15444] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-08-05 18:14:48,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 310902784. Throughput: 0: 6027.4. Samples: 77732620. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:14:48,119][15372] Avg episode reward: [(0, '39.915')] [2024-08-05 18:14:50,845][15444] Updated weights for policy 0, policy_version 37961 (0.0024) [2024-08-05 18:14:53,119][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 311025664. Throughput: 0: 6027.8. Samples: 77750680. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:14:53,120][15372] Avg episode reward: [(0, '40.873')] [2024-08-05 18:14:54,560][15444] Updated weights for policy 0, policy_version 37971 (0.0029) [2024-08-05 18:14:57,627][15444] Updated weights for policy 0, policy_version 37981 (0.0013) [2024-08-05 18:14:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 311140352. Throughput: 0: 6019.4. Samples: 77786820. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:14:58,119][15372] Avg episode reward: [(0, '41.759')] [2024-08-05 18:15:01,206][15444] Updated weights for policy 0, policy_version 37991 (0.0021) [2024-08-05 18:15:03,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24030.0, 300 sec: 24103.9). Total num frames: 311263232. Throughput: 0: 6015.8. Samples: 77822680. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:15:03,119][15372] Avg episode reward: [(0, '41.234')] [2024-08-05 18:15:04,481][15444] Updated weights for policy 0, policy_version 38001 (0.0012) [2024-08-05 18:15:07,818][15444] Updated weights for policy 0, policy_version 38011 (0.0011) [2024-08-05 18:15:08,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 311386112. Throughput: 0: 6015.8. Samples: 77841060. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:15:08,119][15372] Avg episode reward: [(0, '41.417')] [2024-08-05 18:15:11,381][15444] Updated weights for policy 0, policy_version 38021 (0.0018) [2024-08-05 18:15:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24167.3, 300 sec: 24131.7). Total num frames: 311508992. Throughput: 0: 6012.4. Samples: 77876840. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 18:15:13,126][15372] Avg episode reward: [(0, '41.940')] [2024-08-05 18:15:14,654][15444] Updated weights for policy 0, policy_version 38031 (0.0034) [2024-08-05 18:15:18,104][15444] Updated weights for policy 0, policy_version 38041 (0.0016) [2024-08-05 18:15:18,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 311631872. Throughput: 0: 6024.9. Samples: 77913740. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 18:15:18,119][15372] Avg episode reward: [(0, '42.532')] [2024-08-05 18:15:21,288][15444] Updated weights for policy 0, policy_version 38051 (0.0011) [2024-08-05 18:15:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24159.5). 
Total num frames: 311754752. Throughput: 0: 6028.9. Samples: 77932200. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 18:15:23,126][15372] Avg episode reward: [(0, '41.360')] [2024-08-05 18:15:24,848][15444] Updated weights for policy 0, policy_version 38061 (0.0020) [2024-08-05 18:15:28,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 311869440. Throughput: 0: 6062.3. Samples: 77969260. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 18:15:28,126][15372] Avg episode reward: [(0, '41.779')] [2024-08-05 18:15:28,146][15444] Updated weights for policy 0, policy_version 38071 (0.0021) [2024-08-05 18:15:31,414][15444] Updated weights for policy 0, policy_version 38081 (0.0032) [2024-08-05 18:15:33,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 312000512. Throughput: 0: 6052.6. Samples: 78004990. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0) [2024-08-05 18:15:33,126][15372] Avg episode reward: [(0, '42.333')] [2024-08-05 18:15:34,876][15444] Updated weights for policy 0, policy_version 38091 (0.0017) [2024-08-05 18:15:37,325][15417] Signal inference workers to stop experience collection... (14150 times) [2024-08-05 18:15:37,326][15417] Signal inference workers to resume experience collection... (14150 times) [2024-08-05 18:15:37,404][15444] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-08-05 18:15:37,409][15444] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-08-05 18:15:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 312115200. Throughput: 0: 6062.0. Samples: 78023470. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:15:38,126][15372] Avg episode reward: [(0, '41.305')] [2024-08-05 18:15:38,345][15444] Updated weights for policy 0, policy_version 38101 (0.0019) [2024-08-05 18:15:41,577][15444] Updated weights for policy 0, policy_version 38111 (0.0011) [2024-08-05 18:15:43,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 312238080. Throughput: 0: 6061.8. Samples: 78059600. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:15:43,126][15372] Avg episode reward: [(0, '41.067')] [2024-08-05 18:15:44,939][15444] Updated weights for policy 0, policy_version 38121 (0.0019) [2024-08-05 18:15:48,091][15444] Updated weights for policy 0, policy_version 38131 (0.0026) [2024-08-05 18:15:48,119][15372] Fps is (10 sec: 25394.4, 60 sec: 24439.3, 300 sec: 24187.2). Total num frames: 312369152. Throughput: 0: 6080.6. Samples: 78096310. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:15:48,119][15372] Avg episode reward: [(0, '41.543')] [2024-08-05 18:15:51,633][15444] Updated weights for policy 0, policy_version 38141 (0.0014) [2024-08-05 18:15:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 312475648. Throughput: 0: 6094.2. Samples: 78115300. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:15:53,126][15372] Avg episode reward: [(0, '41.597')] [2024-08-05 18:15:55,202][15444] Updated weights for policy 0, policy_version 38151 (0.0022) [2024-08-05 18:15:58,124][15372] Fps is (10 sec: 23744.2, 60 sec: 24437.2, 300 sec: 24159.0). Total num frames: 312606720. Throughput: 0: 6112.3. Samples: 78151930. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 18:15:58,132][15372] Avg episode reward: [(0, '41.514')] [2024-08-05 18:15:58,182][15444] Updated weights for policy 0, policy_version 38161 (0.0013) [2024-08-05 18:16:01,798][15444] Updated weights for policy 0, policy_version 38171 (0.0013) [2024-08-05 18:16:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 312721408. Throughput: 0: 6076.4. Samples: 78187180. Policy #0 lag: (min: 0.0, avg: 5.0, max: 10.0) [2024-08-05 18:16:03,119][15372] Avg episode reward: [(0, '42.306')] [2024-08-05 18:16:05,330][15444] Updated weights for policy 0, policy_version 38181 (0.0023) [2024-08-05 18:16:08,119][15372] Fps is (10 sec: 23768.3, 60 sec: 24302.7, 300 sec: 24131.6). Total num frames: 312844288. Throughput: 0: 6085.0. Samples: 78206030. Policy #0 lag: (min: 0.0, avg: 5.0, max: 10.0) [2024-08-05 18:16:08,120][15372] Avg episode reward: [(0, '41.922')] [2024-08-05 18:16:08,458][15444] Updated weights for policy 0, policy_version 38191 (0.0014) [2024-08-05 18:16:12,183][15444] Updated weights for policy 0, policy_version 38201 (0.0023) [2024-08-05 18:16:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 312967168. Throughput: 0: 6056.0. Samples: 78241780. Policy #0 lag: (min: 0.0, avg: 5.0, max: 10.0) [2024-08-05 18:16:13,119][15372] Avg episode reward: [(0, '40.952')] [2024-08-05 18:16:15,379][15444] Updated weights for policy 0, policy_version 38211 (0.0012) [2024-08-05 18:16:16,773][15417] Signal inference workers to stop experience collection... (14200 times) [2024-08-05 18:16:16,781][15417] Signal inference workers to resume experience collection... 
(14200 times) [2024-08-05 18:16:16,812][15444] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-08-05 18:16:16,812][15444] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-08-05 18:16:18,118][15372] Fps is (10 sec: 23758.6, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 313081856. Throughput: 0: 6058.0. Samples: 78277600. Policy #0 lag: (min: 0.0, avg: 5.0, max: 10.0) [2024-08-05 18:16:18,119][15372] Avg episode reward: [(0, '42.133')] [2024-08-05 18:16:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000038218_313081856.pth... [2024-08-05 18:16:18,227][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000037511_307290112.pth [2024-08-05 18:16:18,891][15444] Updated weights for policy 0, policy_version 38221 (0.0022) [2024-08-05 18:16:22,201][15444] Updated weights for policy 0, policy_version 38231 (0.0028) [2024-08-05 18:16:23,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24166.2, 300 sec: 24131.7). Total num frames: 313204736. Throughput: 0: 6057.3. Samples: 78296050. Policy #0 lag: (min: 0.0, avg: 5.0, max: 10.0) [2024-08-05 18:16:23,119][15372] Avg episode reward: [(0, '41.902')] [2024-08-05 18:16:25,568][15444] Updated weights for policy 0, policy_version 38241 (0.0027) [2024-08-05 18:16:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 313327616. Throughput: 0: 6069.6. Samples: 78332730. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 18:16:28,119][15372] Avg episode reward: [(0, '41.609')] [2024-08-05 18:16:29,068][15444] Updated weights for policy 0, policy_version 38251 (0.0019) [2024-08-05 18:16:32,436][15444] Updated weights for policy 0, policy_version 38261 (0.0023) [2024-08-05 18:16:33,118][15372] Fps is (10 sec: 23758.1, 60 sec: 24029.9, 300 sec: 24103.9). 
Total num frames: 313442304. Throughput: 0: 6039.2. Samples: 78368070. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 18:16:33,126][15372] Avg episode reward: [(0, '41.773')] [2024-08-05 18:16:35,899][15444] Updated weights for policy 0, policy_version 38271 (0.0023) [2024-08-05 18:16:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24105.1). Total num frames: 313565184. Throughput: 0: 6024.0. Samples: 78386380. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 18:16:38,131][15372] Avg episode reward: [(0, '40.911')] [2024-08-05 18:16:39,275][15444] Updated weights for policy 0, policy_version 38281 (0.0022) [2024-08-05 18:16:42,671][15444] Updated weights for policy 0, policy_version 38291 (0.0015) [2024-08-05 18:16:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 313688064. Throughput: 0: 6009.2. Samples: 78422310. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 18:16:43,119][15372] Avg episode reward: [(0, '40.741')] [2024-08-05 18:16:45,754][15444] Updated weights for policy 0, policy_version 38301 (0.0011) [2024-08-05 18:16:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24030.0, 300 sec: 24103.9). Total num frames: 313810944. Throughput: 0: 6030.7. Samples: 78458560. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 18:16:48,126][15372] Avg episode reward: [(0, '41.971')] [2024-08-05 18:16:49,557][15444] Updated weights for policy 0, policy_version 38311 (0.0011) [2024-08-05 18:16:52,896][15444] Updated weights for policy 0, policy_version 38321 (0.0020) [2024-08-05 18:16:53,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 313925632. Throughput: 0: 6013.0. Samples: 78476610. 
Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 18:16:53,119][15372] Avg episode reward: [(0, '42.141')] [2024-08-05 18:16:56,244][15444] Updated weights for policy 0, policy_version 38331 (0.0012) [2024-08-05 18:16:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24032.1, 300 sec: 24131.7). Total num frames: 314048512. Throughput: 0: 6007.6. Samples: 78512120. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 18:16:58,126][15372] Avg episode reward: [(0, '42.374')] [2024-08-05 18:16:58,230][15417] Signal inference workers to stop experience collection... (14250 times) [2024-08-05 18:16:58,231][15417] Signal inference workers to resume experience collection... (14250 times) [2024-08-05 18:16:58,287][15444] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-08-05 18:16:58,288][15444] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-08-05 18:16:59,800][15444] Updated weights for policy 0, policy_version 38341 (0.0029) [2024-08-05 18:17:02,955][15444] Updated weights for policy 0, policy_version 38351 (0.0022) [2024-08-05 18:17:03,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 314171392. Throughput: 0: 6025.1. Samples: 78548730. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 18:17:03,119][15372] Avg episode reward: [(0, '40.692')] [2024-08-05 18:17:06,541][15444] Updated weights for policy 0, policy_version 38361 (0.0013) [2024-08-05 18:17:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.2, 300 sec: 24103.9). Total num frames: 314286080. Throughput: 0: 6018.3. Samples: 78566870. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 18:17:08,119][15372] Avg episode reward: [(0, '41.441')] [2024-08-05 18:17:09,660][15444] Updated weights for policy 0, policy_version 38371 (0.0019) [2024-08-05 18:17:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 314408960. Throughput: 0: 6017.3. Samples: 78603510. 
Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-08-05 18:17:13,126][15372] Avg episode reward: [(0, '42.075')] [2024-08-05 18:17:13,289][15444] Updated weights for policy 0, policy_version 38381 (0.0029) [2024-08-05 18:17:16,489][15444] Updated weights for policy 0, policy_version 38391 (0.0010) [2024-08-05 18:17:18,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 314531840. Throughput: 0: 6033.8. Samples: 78639590. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 18:17:18,127][15372] Avg episode reward: [(0, '41.912')] [2024-08-05 18:17:19,756][15444] Updated weights for policy 0, policy_version 38401 (0.0014) [2024-08-05 18:17:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.6, 300 sec: 24131.7). Total num frames: 314654720. Throughput: 0: 6044.9. Samples: 78658400. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 18:17:23,126][15372] Avg episode reward: [(0, '41.581')] [2024-08-05 18:17:23,237][15444] Updated weights for policy 0, policy_version 38411 (0.0025) [2024-08-05 18:17:26,474][15444] Updated weights for policy 0, policy_version 38421 (0.0013) [2024-08-05 18:17:28,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 314777600. Throughput: 0: 6048.7. Samples: 78694500. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 18:17:28,126][15372] Avg episode reward: [(0, '42.185')] [2024-08-05 18:17:29,918][15444] Updated weights for policy 0, policy_version 38431 (0.0011) [2024-08-05 18:17:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 314900480. Throughput: 0: 6071.3. Samples: 78731770. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 18:17:33,126][15372] Avg episode reward: [(0, '42.205')] [2024-08-05 18:17:33,348][15444] Updated weights for policy 0, policy_version 38441 (0.0013) [2024-08-05 18:17:36,615][15444] Updated weights for policy 0, policy_version 38451 (0.0025) [2024-08-05 18:17:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 315023360. Throughput: 0: 6074.9. Samples: 78749980. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 18:17:38,126][15372] Avg episode reward: [(0, '42.105')] [2024-08-05 18:17:40,023][15444] Updated weights for policy 0, policy_version 38461 (0.0025) [2024-08-05 18:17:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 315146240. Throughput: 0: 6096.9. Samples: 78786480. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:17:43,126][15372] Avg episode reward: [(0, '41.718')] [2024-08-05 18:17:43,356][15444] Updated weights for policy 0, policy_version 38471 (0.0016) [2024-08-05 18:17:46,863][15444] Updated weights for policy 0, policy_version 38481 (0.0014) [2024-08-05 18:17:48,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 315269120. Throughput: 0: 6074.9. Samples: 78822100. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:17:48,119][15372] Avg episode reward: [(0, '40.948')] [2024-08-05 18:17:50,078][15444] Updated weights for policy 0, policy_version 38491 (0.0022) [2024-08-05 18:17:53,119][15372] Fps is (10 sec: 23755.2, 60 sec: 24302.7, 300 sec: 24159.4). Total num frames: 315383808. Throughput: 0: 6090.4. Samples: 78840940. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:17:53,127][15372] Avg episode reward: [(0, '42.020')] [2024-08-05 18:17:53,417][15444] Updated weights for policy 0, policy_version 38501 (0.0018) [2024-08-05 18:17:54,390][15417] Signal inference workers to stop experience collection... 
(14300 times) [2024-08-05 18:17:54,392][15417] Signal inference workers to resume experience collection... (14300 times) [2024-08-05 18:17:54,440][15444] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-08-05 18:17:54,440][15444] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-08-05 18:17:56,856][15444] Updated weights for policy 0, policy_version 38511 (0.0014) [2024-08-05 18:17:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 315506688. Throughput: 0: 6093.1. Samples: 78877700. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:17:58,126][15372] Avg episode reward: [(0, '41.988')] [2024-08-05 18:18:00,234][15444] Updated weights for policy 0, policy_version 38521 (0.0014) [2024-08-05 18:18:03,118][15372] Fps is (10 sec: 24577.6, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 315629568. Throughput: 0: 6097.6. Samples: 78913980. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:18:03,126][15372] Avg episode reward: [(0, '41.161')] [2024-08-05 18:18:03,667][15444] Updated weights for policy 0, policy_version 38531 (0.0012) [2024-08-05 18:18:07,131][15444] Updated weights for policy 0, policy_version 38541 (0.0015) [2024-08-05 18:18:08,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24439.3, 300 sec: 24187.2). Total num frames: 315752448. Throughput: 0: 6087.7. Samples: 78932350. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:18:08,119][15372] Avg episode reward: [(0, '41.860')] [2024-08-05 18:18:10,270][15444] Updated weights for policy 0, policy_version 38551 (0.0010) [2024-08-05 18:18:13,119][15372] Fps is (10 sec: 23754.4, 60 sec: 24302.6, 300 sec: 24159.4). Total num frames: 315867136. Throughput: 0: 6096.1. Samples: 78968830. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:18:13,127][15372] Avg episode reward: [(0, '42.150')] [2024-08-05 18:18:13,694][15444] Updated weights for policy 0, policy_version 38561 (0.0024) [2024-08-05 18:18:17,135][15444] Updated weights for policy 0, policy_version 38571 (0.0013) [2024-08-05 18:18:18,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 315998208. Throughput: 0: 6073.6. Samples: 79005080. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:18:18,119][15372] Avg episode reward: [(0, '42.418')] [2024-08-05 18:18:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000038574_315998208.pth... [2024-08-05 18:18:18,259][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000037865_310190080.pth [2024-08-05 18:18:20,417][15444] Updated weights for policy 0, policy_version 38581 (0.0018) [2024-08-05 18:18:23,120][15372] Fps is (10 sec: 24573.6, 60 sec: 24302.2, 300 sec: 24214.8). Total num frames: 316112896. Throughput: 0: 6072.2. Samples: 79023240. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 18:18:23,121][15372] Avg episode reward: [(0, '42.093')] [2024-08-05 18:18:24,072][15444] Updated weights for policy 0, policy_version 38591 (0.0016) [2024-08-05 18:18:27,273][15444] Updated weights for policy 0, policy_version 38601 (0.0016) [2024-08-05 18:18:28,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24302.8, 300 sec: 24215.0). Total num frames: 316235776. Throughput: 0: 6064.2. Samples: 79059370. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:18:28,119][15372] Avg episode reward: [(0, '41.936')] [2024-08-05 18:18:30,639][15444] Updated weights for policy 0, policy_version 38611 (0.0012) [2024-08-05 18:18:33,118][15372] Fps is (10 sec: 24581.0, 60 sec: 24303.0, 300 sec: 24187.4). Total num frames: 316358656. 
Throughput: 0: 6080.5. Samples: 79095720. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:18:33,126][15372] Avg episode reward: [(0, '42.496')] [2024-08-05 18:18:34,148][15444] Updated weights for policy 0, policy_version 38621 (0.0026) [2024-08-05 18:18:37,441][15444] Updated weights for policy 0, policy_version 38631 (0.0029) [2024-08-05 18:18:38,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 316473344. Throughput: 0: 6072.2. Samples: 79114190. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:18:38,120][15372] Avg episode reward: [(0, '42.153')] [2024-08-05 18:18:40,990][15444] Updated weights for policy 0, policy_version 38641 (0.0022) [2024-08-05 18:18:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 316596224. Throughput: 0: 6045.3. Samples: 79149740. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:18:43,119][15372] Avg episode reward: [(0, '41.520')] [2024-08-05 18:18:44,286][15444] Updated weights for policy 0, policy_version 38651 (0.0022) [2024-08-05 18:18:47,547][15444] Updated weights for policy 0, policy_version 38661 (0.0016) [2024-08-05 18:18:48,118][15372] Fps is (10 sec: 24577.5, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 316719104. Throughput: 0: 6041.8. Samples: 79185860. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:18:48,119][15372] Avg episode reward: [(0, '40.716')] [2024-08-05 18:18:51,128][15444] Updated weights for policy 0, policy_version 38671 (0.0011) [2024-08-05 18:18:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.2, 300 sec: 24215.0). Total num frames: 316841984. Throughput: 0: 6043.8. Samples: 79204320. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 18:18:53,119][15372] Avg episode reward: [(0, '41.666')] [2024-08-05 18:18:54,237][15444] Updated weights for policy 0, policy_version 38681 (0.0011) [2024-08-05 18:18:55,881][15417] Signal inference workers to stop experience collection... (14350 times) [2024-08-05 18:18:55,882][15417] Signal inference workers to resume experience collection... (14350 times) [2024-08-05 18:18:55,933][15444] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-08-05 18:18:55,934][15444] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-08-05 18:18:57,840][15444] Updated weights for policy 0, policy_version 38691 (0.0012) [2024-08-05 18:18:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 316964864. Throughput: 0: 6047.5. Samples: 79240960. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 18:18:58,119][15372] Avg episode reward: [(0, '42.079')] [2024-08-05 18:19:00,937][15444] Updated weights for policy 0, policy_version 38701 (0.0011) [2024-08-05 18:19:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 317087744. Throughput: 0: 6057.8. Samples: 79277680. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 18:19:03,126][15372] Avg episode reward: [(0, '42.171')] [2024-08-05 18:19:04,472][15444] Updated weights for policy 0, policy_version 38711 (0.0013) [2024-08-05 18:19:07,882][15444] Updated weights for policy 0, policy_version 38721 (0.0014) [2024-08-05 18:19:08,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.5, 300 sec: 24215.2). Total num frames: 317202432. Throughput: 0: 6059.4. Samples: 79295900. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 18:19:08,119][15372] Avg episode reward: [(0, '43.017')] [2024-08-05 18:19:11,071][15444] Updated weights for policy 0, policy_version 38731 (0.0013) [2024-08-05 18:19:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.9, 300 sec: 24242.8). Total num frames: 317333504. Throughput: 0: 6050.5. Samples: 79331640. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 18:19:13,119][15372] Avg episode reward: [(0, '44.107')] [2024-08-05 18:19:13,120][15417] Saving new best policy, reward=44.107! [2024-08-05 18:19:14,942][15444] Updated weights for policy 0, policy_version 38741 (0.0023) [2024-08-05 18:19:18,032][15444] Updated weights for policy 0, policy_version 38751 (0.0019) [2024-08-05 18:19:18,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 317448192. Throughput: 0: 6030.2. Samples: 79367080. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:19:18,119][15372] Avg episode reward: [(0, '43.677')] [2024-08-05 18:19:21,537][15444] Updated weights for policy 0, policy_version 38761 (0.0023) [2024-08-05 18:19:23,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24167.2, 300 sec: 24187.2). Total num frames: 317562880. Throughput: 0: 6031.0. Samples: 79385580. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:19:23,126][15372] Avg episode reward: [(0, '41.765')] [2024-08-05 18:19:24,810][15444] Updated weights for policy 0, policy_version 38771 (0.0025) [2024-08-05 18:19:28,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 317685760. Throughput: 0: 6047.3. Samples: 79421870. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:19:28,126][15372] Avg episode reward: [(0, '41.688')] [2024-08-05 18:19:28,276][15444] Updated weights for policy 0, policy_version 38781 (0.0022) [2024-08-05 18:19:31,823][15444] Updated weights for policy 0, policy_version 38791 (0.0019) [2024-08-05 18:19:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 317808640. Throughput: 0: 6050.9. Samples: 79458150. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:19:33,119][15372] Avg episode reward: [(0, '42.095')] [2024-08-05 18:19:34,976][15444] Updated weights for policy 0, policy_version 38801 (0.0020) [2024-08-05 18:19:38,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24303.2, 300 sec: 24215.0). Total num frames: 317931520. Throughput: 0: 6050.7. Samples: 79476600. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:19:38,126][15372] Avg episode reward: [(0, '42.189')] [2024-08-05 18:19:38,381][15444] Updated weights for policy 0, policy_version 38811 (0.0026) [2024-08-05 18:19:41,620][15444] Updated weights for policy 0, policy_version 38821 (0.0023) [2024-08-05 18:19:42,723][15417] Signal inference workers to stop experience collection... (14400 times) [2024-08-05 18:19:42,731][15417] Signal inference workers to resume experience collection... (14400 times) [2024-08-05 18:19:42,801][15444] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-08-05 18:19:42,801][15444] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-08-05 18:19:43,119][15372] Fps is (10 sec: 24574.4, 60 sec: 24302.7, 300 sec: 24242.7). Total num frames: 318054400. Throughput: 0: 6039.2. Samples: 79512730. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 18:19:43,120][15372] Avg episode reward: [(0, '42.343')] [2024-08-05 18:19:45,182][15444] Updated weights for policy 0, policy_version 38831 (0.0012) [2024-08-05 18:19:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 318177280. Throughput: 0: 6036.7. Samples: 79549330. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 18:19:48,126][15372] Avg episode reward: [(0, '42.835')] [2024-08-05 18:19:48,730][15444] Updated weights for policy 0, policy_version 38841 (0.0016) [2024-08-05 18:19:51,765][15444] Updated weights for policy 0, policy_version 38851 (0.0010) [2024-08-05 18:19:53,118][15372] Fps is (10 sec: 23758.4, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 318291968. Throughput: 0: 6043.4. Samples: 79567850. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 18:19:53,119][15372] Avg episode reward: [(0, '43.296')] [2024-08-05 18:19:55,431][15444] Updated weights for policy 0, policy_version 38861 (0.0014) [2024-08-05 18:19:58,119][15372] Fps is (10 sec: 23754.6, 60 sec: 24166.0, 300 sec: 24242.7). Total num frames: 318414848. Throughput: 0: 6056.8. Samples: 79604200. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 18:19:58,120][15372] Avg episode reward: [(0, '41.941')] [2024-08-05 18:19:58,452][15444] Updated weights for policy 0, policy_version 38871 (0.0031) [2024-08-05 18:20:02,099][15444] Updated weights for policy 0, policy_version 38881 (0.0011) [2024-08-05 18:20:03,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 318537728. Throughput: 0: 6057.3. Samples: 79639660. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 18:20:03,119][15372] Avg episode reward: [(0, '41.553')] [2024-08-05 18:20:05,706][15444] Updated weights for policy 0, policy_version 38891 (0.0024) [2024-08-05 18:20:08,119][15372] Fps is (10 sec: 23758.9, 60 sec: 24166.4, 300 sec: 24215.0). 
Total num frames: 318652416. Throughput: 0: 6071.8. Samples: 79658810. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:20:08,119][15372] Avg episode reward: [(0, '40.781')] [2024-08-05 18:20:08,711][15444] Updated weights for policy 0, policy_version 38901 (0.0023) [2024-08-05 18:20:12,292][15444] Updated weights for policy 0, policy_version 38911 (0.0011) [2024-08-05 18:20:13,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.8, 300 sec: 24215.0). Total num frames: 318775296. Throughput: 0: 6054.9. Samples: 79694340. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:20:13,119][15372] Avg episode reward: [(0, '41.858')] [2024-08-05 18:20:15,334][15444] Updated weights for policy 0, policy_version 38921 (0.0021) [2024-08-05 18:20:18,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 318898176. Throughput: 0: 6068.9. Samples: 79731250. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:20:18,119][15372] Avg episode reward: [(0, '42.207')] [2024-08-05 18:20:18,128][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000038928_318898176.pth... [2024-08-05 18:20:18,281][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000038218_313081856.pth [2024-08-05 18:20:19,079][15444] Updated weights for policy 0, policy_version 38931 (0.0013) [2024-08-05 18:20:21,346][15417] Signal inference workers to stop experience collection... (14450 times) [2024-08-05 18:20:21,347][15417] Signal inference workers to resume experience collection... 
(14450 times) [2024-08-05 18:20:21,412][15444] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-08-05 18:20:21,412][15444] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-08-05 18:20:22,315][15444] Updated weights for policy 0, policy_version 38941 (0.0018) [2024-08-05 18:20:23,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 319021056. Throughput: 0: 6066.2. Samples: 79749580. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:20:23,119][15372] Avg episode reward: [(0, '43.699')] [2024-08-05 18:20:25,422][15444] Updated weights for policy 0, policy_version 38951 (0.0018) [2024-08-05 18:20:28,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 319143936. Throughput: 0: 6099.4. Samples: 79787200. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:20:28,119][15372] Avg episode reward: [(0, '43.529')] [2024-08-05 18:20:28,915][15444] Updated weights for policy 0, policy_version 38961 (0.0015) [2024-08-05 18:20:32,396][15444] Updated weights for policy 0, policy_version 38971 (0.0021) [2024-08-05 18:20:33,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 319266816. Throughput: 0: 6077.6. Samples: 79822820. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:20:33,126][15372] Avg episode reward: [(0, '42.510')] [2024-08-05 18:20:35,574][15444] Updated weights for policy 0, policy_version 38981 (0.0017) [2024-08-05 18:20:38,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 319381504. Throughput: 0: 6070.6. Samples: 79841030. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:20:38,127][15372] Avg episode reward: [(0, '42.830')] [2024-08-05 18:20:39,075][15444] Updated weights for policy 0, policy_version 38991 (0.0021) [2024-08-05 18:20:42,421][15444] Updated weights for policy 0, policy_version 39001 (0.0013) [2024-08-05 18:20:43,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 319504384. Throughput: 0: 6067.2. Samples: 79877220. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:20:43,127][15372] Avg episode reward: [(0, '43.435')] [2024-08-05 18:20:46,010][15444] Updated weights for policy 0, policy_version 39011 (0.0011) [2024-08-05 18:20:48,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.3, 300 sec: 24242.8). Total num frames: 319627264. Throughput: 0: 6072.7. Samples: 79912930. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:20:48,119][15372] Avg episode reward: [(0, '42.115')] [2024-08-05 18:20:49,451][15444] Updated weights for policy 0, policy_version 39021 (0.0011) [2024-08-05 18:20:52,906][15444] Updated weights for policy 0, policy_version 39031 (0.0030) [2024-08-05 18:20:53,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24166.4, 300 sec: 24187.7). Total num frames: 319741952. Throughput: 0: 6052.2. Samples: 79931160. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:20:53,119][15372] Avg episode reward: [(0, '41.267')] [2024-08-05 18:20:56,122][15444] Updated weights for policy 0, policy_version 39041 (0.0014) [2024-08-05 18:20:58,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.3, 300 sec: 24242.8). Total num frames: 319873024. Throughput: 0: 6060.2. Samples: 79967050. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 18:20:58,126][15372] Avg episode reward: [(0, '41.706')] [2024-08-05 18:20:59,567][15444] Updated weights for policy 0, policy_version 39051 (0.0013) [2024-08-05 18:21:02,901][15444] Updated weights for policy 0, policy_version 39061 (0.0018) [2024-08-05 18:21:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24215.1). Total num frames: 319987712. Throughput: 0: 6052.5. Samples: 80003610. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 18:21:03,119][15372] Avg episode reward: [(0, '41.950')] [2024-08-05 18:21:06,223][15444] Updated weights for policy 0, policy_version 39071 (0.0016) [2024-08-05 18:21:08,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 320110592. Throughput: 0: 6056.9. Samples: 80022140. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 18:21:08,126][15372] Avg episode reward: [(0, '42.918')] [2024-08-05 18:21:09,580][15444] Updated weights for policy 0, policy_version 39081 (0.0011) [2024-08-05 18:21:13,026][15444] Updated weights for policy 0, policy_version 39091 (0.0022) [2024-08-05 18:21:13,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 320233472. Throughput: 0: 6030.7. Samples: 80058580. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 18:21:13,119][15372] Avg episode reward: [(0, '42.793')] [2024-08-05 18:21:16,151][15444] Updated weights for policy 0, policy_version 39101 (0.0012) [2024-08-05 18:21:18,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 320356352. Throughput: 0: 6047.5. Samples: 80094960. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 18:21:18,127][15372] Avg episode reward: [(0, '41.697')] [2024-08-05 18:21:19,688][15444] Updated weights for policy 0, policy_version 39111 (0.0013) [2024-08-05 18:21:23,119][15372] Fps is (10 sec: 23754.7, 60 sec: 24166.1, 300 sec: 24214.9). 
Total num frames: 320471040. Throughput: 0: 6061.0. Samples: 80113780. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:21:23,127][15372] Avg episode reward: [(0, '42.145')] [2024-08-05 18:21:23,185][15444] Updated weights for policy 0, policy_version 39121 (0.0017) [2024-08-05 18:21:23,303][15417] Signal inference workers to stop experience collection... (14500 times) [2024-08-05 18:21:23,304][15417] Signal inference workers to resume experience collection... (14500 times) [2024-08-05 18:21:23,352][15444] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-08-05 18:21:23,352][15444] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-08-05 18:21:26,411][15444] Updated weights for policy 0, policy_version 39131 (0.0019) [2024-08-05 18:21:28,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.1, 300 sec: 24270.5). Total num frames: 320602112. Throughput: 0: 6051.6. Samples: 80149540. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:21:28,126][15372] Avg episode reward: [(0, '42.139')] [2024-08-05 18:21:29,882][15444] Updated weights for policy 0, policy_version 39141 (0.0016) [2024-08-05 18:21:33,118][15372] Fps is (10 sec: 24578.3, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 320716800. Throughput: 0: 6067.1. Samples: 80185950. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:21:33,126][15372] Avg episode reward: [(0, '42.304')] [2024-08-05 18:21:33,359][15444] Updated weights for policy 0, policy_version 39151 (0.0025) [2024-08-05 18:21:36,583][15444] Updated weights for policy 0, policy_version 39161 (0.0018) [2024-08-05 18:21:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.1, 300 sec: 24242.8). Total num frames: 320839680. Throughput: 0: 6076.4. Samples: 80204600. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:21:38,126][15372] Avg episode reward: [(0, '41.392')] [2024-08-05 18:21:39,833][15444] Updated weights for policy 0, policy_version 39171 (0.0024) [2024-08-05 18:21:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 320954368. Throughput: 0: 6093.1. Samples: 80241240. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:21:43,126][15372] Avg episode reward: [(0, '41.621')] [2024-08-05 18:21:43,341][15444] Updated weights for policy 0, policy_version 39181 (0.0012) [2024-08-05 18:21:46,567][15444] Updated weights for policy 0, policy_version 39191 (0.0014) [2024-08-05 18:21:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24270.5). Total num frames: 321085440. Throughput: 0: 6076.2. Samples: 80277040. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:21:48,126][15372] Avg episode reward: [(0, '40.504')] [2024-08-05 18:21:49,990][15444] Updated weights for policy 0, policy_version 39201 (0.0013) [2024-08-05 18:21:53,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24439.5, 300 sec: 24270.5). Total num frames: 321208320. Throughput: 0: 6075.6. Samples: 80295540. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:21:53,126][15372] Avg episode reward: [(0, '41.802')] [2024-08-05 18:21:53,602][15444] Updated weights for policy 0, policy_version 39211 (0.0013) [2024-08-05 18:21:56,666][15444] Updated weights for policy 0, policy_version 39221 (0.0022) [2024-08-05 18:21:58,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 321323008. Throughput: 0: 6075.1. Samples: 80331960. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:21:58,126][15372] Avg episode reward: [(0, '42.194')] [2024-08-05 18:22:00,132][15444] Updated weights for policy 0, policy_version 39231 (0.0018) [2024-08-05 18:22:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.5, 300 sec: 24298.3). 
Total num frames: 321454080. Throughput: 0: 6080.2. Samples: 80368570. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:22:03,126][15372] Avg episode reward: [(0, '41.758')] [2024-08-05 18:22:03,487][15444] Updated weights for policy 0, policy_version 39241 (0.0017) [2024-08-05 18:22:06,908][15444] Updated weights for policy 0, policy_version 39251 (0.0015) [2024-08-05 18:22:08,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24270.5). Total num frames: 321568768. Throughput: 0: 6060.8. Samples: 80386510. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:22:08,119][15372] Avg episode reward: [(0, '41.599')] [2024-08-05 18:22:10,219][15444] Updated weights for policy 0, policy_version 39261 (0.0013) [2024-08-05 18:22:13,125][15372] Fps is (10 sec: 22923.2, 60 sec: 24163.9, 300 sec: 24242.3). Total num frames: 321683456. Throughput: 0: 6068.3. Samples: 80422650. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:22:13,133][15372] Avg episode reward: [(0, '41.685')] [2024-08-05 18:22:13,909][15444] Updated weights for policy 0, policy_version 39271 (0.0019) [2024-08-05 18:22:17,363][15444] Updated weights for policy 0, policy_version 39281 (0.0010) [2024-08-05 18:22:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 321806336. Throughput: 0: 6052.0. Samples: 80458290. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:22:18,119][15372] Avg episode reward: [(0, '42.100')] [2024-08-05 18:22:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000039283_321806336.pth... [2024-08-05 18:22:18,242][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000038574_315998208.pth [2024-08-05 18:22:19,586][15417] Signal inference workers to stop experience collection... 
(14550 times) [2024-08-05 18:22:19,586][15417] Signal inference workers to resume experience collection... (14550 times) [2024-08-05 18:22:19,656][15444] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-08-05 18:22:19,656][15444] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-08-05 18:22:20,683][15444] Updated weights for policy 0, policy_version 39291 (0.0026) [2024-08-05 18:22:23,119][15372] Fps is (10 sec: 24590.2, 60 sec: 24303.1, 300 sec: 24242.7). Total num frames: 321929216. Throughput: 0: 6041.9. Samples: 80476490. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:22:23,119][15372] Avg episode reward: [(0, '41.864')] [2024-08-05 18:22:23,973][15444] Updated weights for policy 0, policy_version 39301 (0.0026) [2024-08-05 18:22:27,324][15444] Updated weights for policy 0, policy_version 39311 (0.0017) [2024-08-05 18:22:28,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24215.0). Total num frames: 322043904. Throughput: 0: 6025.3. Samples: 80512380. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:22:28,119][15372] Avg episode reward: [(0, '42.251')] [2024-08-05 18:22:30,985][15444] Updated weights for policy 0, policy_version 39321 (0.0034) [2024-08-05 18:22:33,119][15372] Fps is (10 sec: 23757.8, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 322166784. Throughput: 0: 6032.9. Samples: 80548520. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:22:33,119][15372] Avg episode reward: [(0, '42.037')] [2024-08-05 18:22:34,390][15444] Updated weights for policy 0, policy_version 39331 (0.0021) [2024-08-05 18:22:37,752][15444] Updated weights for policy 0, policy_version 39341 (0.0019) [2024-08-05 18:22:38,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 322289664. Throughput: 0: 6013.1. Samples: 80566130. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:22:38,119][15372] Avg episode reward: [(0, '42.558')] [2024-08-05 18:22:41,181][15444] Updated weights for policy 0, policy_version 39351 (0.0014) [2024-08-05 18:22:43,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 322404352. Throughput: 0: 6004.5. Samples: 80602160. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:22:43,126][15372] Avg episode reward: [(0, '42.853')] [2024-08-05 18:22:44,330][15444] Updated weights for policy 0, policy_version 39361 (0.0013) [2024-08-05 18:22:47,900][15444] Updated weights for policy 0, policy_version 39371 (0.0014) [2024-08-05 18:22:48,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 322527232. Throughput: 0: 5992.2. Samples: 80638220. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:22:48,119][15372] Avg episode reward: [(0, '42.477')] [2024-08-05 18:22:51,384][15444] Updated weights for policy 0, policy_version 39381 (0.0012) [2024-08-05 18:22:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 322650112. Throughput: 0: 6012.2. Samples: 80657060. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:22:53,126][15372] Avg episode reward: [(0, '41.350')] [2024-08-05 18:22:54,659][15444] Updated weights for policy 0, policy_version 39391 (0.0023) [2024-08-05 18:22:58,055][15444] Updated weights for policy 0, policy_version 39401 (0.0012) [2024-08-05 18:22:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 322772992. Throughput: 0: 6021.1. Samples: 80693560. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:22:58,119][15372] Avg episode reward: [(0, '40.124')] [2024-08-05 18:23:01,270][15444] Updated weights for policy 0, policy_version 39411 (0.0019) [2024-08-05 18:23:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24215.0). 
Total num frames: 322895872. Throughput: 0: 6024.9. Samples: 80729410. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:23:03,126][15372] Avg episode reward: [(0, '42.140')] [2024-08-05 18:23:04,653][15444] Updated weights for policy 0, policy_version 39421 (0.0011) [2024-08-05 18:23:08,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24215.1). Total num frames: 323010560. Throughput: 0: 6033.0. Samples: 80747970. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:23:08,126][15372] Avg episode reward: [(0, '42.147')] [2024-08-05 18:23:08,192][15444] Updated weights for policy 0, policy_version 39431 (0.0024) [2024-08-05 18:23:08,309][15417] Signal inference workers to stop experience collection... (14600 times) [2024-08-05 18:23:08,311][15417] Signal inference workers to resume experience collection... (14600 times) [2024-08-05 18:23:08,360][15444] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-08-05 18:23:08,367][15444] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-08-05 18:23:11,530][15444] Updated weights for policy 0, policy_version 39441 (0.0032) [2024-08-05 18:23:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24168.9, 300 sec: 24187.2). Total num frames: 323133440. Throughput: 0: 6031.8. Samples: 80783810. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:23:13,126][15372] Avg episode reward: [(0, '42.582')] [2024-08-05 18:23:14,815][15444] Updated weights for policy 0, policy_version 39451 (0.0038) [2024-08-05 18:23:18,119][15372] Fps is (10 sec: 24574.6, 60 sec: 24166.2, 300 sec: 24215.1). Total num frames: 323256320. Throughput: 0: 6037.3. Samples: 80820200. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:23:18,127][15372] Avg episode reward: [(0, '41.732')] [2024-08-05 18:23:18,451][15444] Updated weights for policy 0, policy_version 39461 (0.0011) [2024-08-05 18:23:21,901][15444] Updated weights for policy 0, policy_version 39471 (0.0023) [2024-08-05 18:23:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 323379200. Throughput: 0: 6054.3. Samples: 80838570. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:23:23,119][15372] Avg episode reward: [(0, '41.437')] [2024-08-05 18:23:24,998][15444] Updated weights for policy 0, policy_version 39481 (0.0020) [2024-08-05 18:23:28,119][15372] Fps is (10 sec: 23757.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 323493888. Throughput: 0: 6061.3. Samples: 80874920. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:23:28,126][15372] Avg episode reward: [(0, '42.851')] [2024-08-05 18:23:28,590][15444] Updated weights for policy 0, policy_version 39491 (0.0012) [2024-08-05 18:23:31,863][15444] Updated weights for policy 0, policy_version 39501 (0.0011) [2024-08-05 18:23:33,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 323624960. Throughput: 0: 6056.7. Samples: 80910770. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:23:33,119][15372] Avg episode reward: [(0, '42.903')] [2024-08-05 18:23:35,077][15444] Updated weights for policy 0, policy_version 39511 (0.0023) [2024-08-05 18:23:38,119][15372] Fps is (10 sec: 25395.3, 60 sec: 24303.0, 300 sec: 24242.8). Total num frames: 323747840. Throughput: 0: 6061.5. Samples: 80929830. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:23:38,127][15372] Avg episode reward: [(0, '41.356')] [2024-08-05 18:23:38,548][15444] Updated weights for policy 0, policy_version 39521 (0.0012) [2024-08-05 18:23:41,890][15444] Updated weights for policy 0, policy_version 39531 (0.0012) [2024-08-05 18:23:43,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 323862528. Throughput: 0: 6053.3. Samples: 80965960. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:23:43,119][15372] Avg episode reward: [(0, '42.169')] [2024-08-05 18:23:45,344][15444] Updated weights for policy 0, policy_version 39541 (0.0013) [2024-08-05 18:23:48,128][15372] Fps is (10 sec: 23734.6, 60 sec: 24299.1, 300 sec: 24214.2). Total num frames: 323985408. Throughput: 0: 6053.2. Samples: 81001860. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:23:48,128][15372] Avg episode reward: [(0, '43.056')] [2024-08-05 18:23:48,819][15444] Updated weights for policy 0, policy_version 39551 (0.0011) [2024-08-05 18:23:52,180][15444] Updated weights for policy 0, policy_version 39561 (0.0016) [2024-08-05 18:23:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 324100096. Throughput: 0: 6050.7. Samples: 81020250. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:23:53,119][15372] Avg episode reward: [(0, '41.784')] [2024-08-05 18:23:55,447][15444] Updated weights for policy 0, policy_version 39571 (0.0012) [2024-08-05 18:23:58,118][15372] Fps is (10 sec: 24599.2, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 324231168. Throughput: 0: 6072.0. Samples: 81057050. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:23:58,119][15372] Avg episode reward: [(0, '42.286')] [2024-08-05 18:23:58,901][15444] Updated weights for policy 0, policy_version 39581 (0.0012) [2024-08-05 18:24:02,278][15444] Updated weights for policy 0, policy_version 39591 (0.0031) [2024-08-05 18:24:03,121][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 324345856. Throughput: 0: 6054.3. Samples: 81092640. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:24:03,121][15372] Avg episode reward: [(0, '42.437')] [2024-08-05 18:24:05,454][15444] Updated weights for policy 0, policy_version 39601 (0.0010) [2024-08-05 18:24:08,118][15372] Fps is (10 sec: 22937.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 324460544. Throughput: 0: 6046.2. Samples: 81110650. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:24:08,126][15372] Avg episode reward: [(0, '42.055')] [2024-08-05 18:24:09,351][15444] Updated weights for policy 0, policy_version 39611 (0.0037) [2024-08-05 18:24:11,220][15417] Signal inference workers to stop experience collection... (14650 times) [2024-08-05 18:24:11,221][15417] Signal inference workers to resume experience collection... (14650 times) [2024-08-05 18:24:11,245][15444] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-08-05 18:24:11,245][15444] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-08-05 18:24:12,913][15444] Updated weights for policy 0, policy_version 39621 (0.0023) [2024-08-05 18:24:13,119][15372] Fps is (10 sec: 22937.3, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 324575232. Throughput: 0: 6025.1. Samples: 81146050. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 18:24:13,119][15372] Avg episode reward: [(0, '42.320')] [2024-08-05 18:24:15,866][15444] Updated weights for policy 0, policy_version 39631 (0.0031) [2024-08-05 18:24:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 324706304. Throughput: 0: 6032.2. Samples: 81182220. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 18:24:18,126][15372] Avg episode reward: [(0, '42.665')] [2024-08-05 18:24:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000039637_324706304.pth... [2024-08-05 18:24:18,245][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000038928_318898176.pth [2024-08-05 18:24:19,593][15444] Updated weights for policy 0, policy_version 39641 (0.0014) [2024-08-05 18:24:22,852][15444] Updated weights for policy 0, policy_version 39651 (0.0023) [2024-08-05 18:24:23,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 324820992. Throughput: 0: 6010.6. Samples: 81200310. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 18:24:23,119][15372] Avg episode reward: [(0, '42.517')] [2024-08-05 18:24:26,148][15444] Updated weights for policy 0, policy_version 39661 (0.0012) [2024-08-05 18:24:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 324943872. Throughput: 0: 6005.6. Samples: 81236210. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 18:24:28,126][15372] Avg episode reward: [(0, '41.442')] [2024-08-05 18:24:29,552][15444] Updated weights for policy 0, policy_version 39671 (0.0018) [2024-08-05 18:24:33,011][15444] Updated weights for policy 0, policy_version 39681 (0.0023) [2024-08-05 18:24:33,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 325066752. 
Throughput: 0: 6017.3. Samples: 81272580. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 18:24:33,119][15372] Avg episode reward: [(0, '41.891')] [2024-08-05 18:24:36,428][15444] Updated weights for policy 0, policy_version 39691 (0.0031) [2024-08-05 18:24:38,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24187.3). Total num frames: 325189632. Throughput: 0: 6011.1. Samples: 81290750. Policy #0 lag: (min: 0.0, avg: 4.5, max: 9.0) [2024-08-05 18:24:38,126][15372] Avg episode reward: [(0, '42.493')] [2024-08-05 18:24:39,743][15444] Updated weights for policy 0, policy_version 39701 (0.0013) [2024-08-05 18:24:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 325304320. Throughput: 0: 6010.0. Samples: 81327500. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 18:24:43,126][15372] Avg episode reward: [(0, '41.160')] [2024-08-05 18:24:43,382][15444] Updated weights for policy 0, policy_version 39711 (0.0050) [2024-08-05 18:24:45,779][15417] Signal inference workers to stop experience collection... (14700 times) [2024-08-05 18:24:45,780][15417] Signal inference workers to resume experience collection... (14700 times) [2024-08-05 18:24:45,839][15444] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-08-05 18:24:45,839][15444] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-08-05 18:24:46,333][15444] Updated weights for policy 0, policy_version 39721 (0.0012) [2024-08-05 18:24:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24033.7, 300 sec: 24187.2). Total num frames: 325427200. Throughput: 0: 6029.1. Samples: 81363950. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 18:24:48,119][15372] Avg episode reward: [(0, '40.911')] [2024-08-05 18:24:49,922][15444] Updated weights for policy 0, policy_version 39731 (0.0012) [2024-08-05 18:24:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 325550080. 
Throughput: 0: 6026.9. Samples: 81381860. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 18:24:53,119][15372] Avg episode reward: [(0, '41.676')] [2024-08-05 18:24:53,426][15444] Updated weights for policy 0, policy_version 39741 (0.0028) [2024-08-05 18:24:56,467][15444] Updated weights for policy 0, policy_version 39751 (0.0023) [2024-08-05 18:24:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 325672960. Throughput: 0: 6052.9. Samples: 81418430. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 18:24:58,126][15372] Avg episode reward: [(0, '41.751')] [2024-08-05 18:24:59,906][15444] Updated weights for policy 0, policy_version 39761 (0.0011) [2024-08-05 18:25:03,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 325795840. Throughput: 0: 6073.7. Samples: 81455540. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 18:25:03,119][15372] Avg episode reward: [(0, '42.309')] [2024-08-05 18:25:03,258][15444] Updated weights for policy 0, policy_version 39771 (0.0014) [2024-08-05 18:25:06,732][15444] Updated weights for policy 0, policy_version 39781 (0.0026) [2024-08-05 18:25:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 325918720. Throughput: 0: 6080.9. Samples: 81473950. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:25:08,119][15372] Avg episode reward: [(0, '42.729')] [2024-08-05 18:25:09,857][15444] Updated weights for policy 0, policy_version 39791 (0.0016) [2024-08-05 18:25:13,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 326041600. Throughput: 0: 6090.9. Samples: 81510300. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:25:13,126][15372] Avg episode reward: [(0, '43.230')] [2024-08-05 18:25:13,329][15444] Updated weights for policy 0, policy_version 39801 (0.0011) [2024-08-05 18:25:16,862][15444] Updated weights for policy 0, policy_version 39811 (0.0023) [2024-08-05 18:25:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 326164480. Throughput: 0: 6078.9. Samples: 81546130. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:25:18,119][15372] Avg episode reward: [(0, '42.168')] [2024-08-05 18:25:20,186][15444] Updated weights for policy 0, policy_version 39821 (0.0014) [2024-08-05 18:25:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.0, 300 sec: 24187.3). Total num frames: 326279168. Throughput: 0: 6075.1. Samples: 81564130. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:25:23,126][15372] Avg episode reward: [(0, '41.685')] [2024-08-05 18:25:23,658][15444] Updated weights for policy 0, policy_version 39831 (0.0015) [2024-08-05 18:25:26,999][15444] Updated weights for policy 0, policy_version 39841 (0.0014) [2024-08-05 18:25:28,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 326402048. Throughput: 0: 6061.5. Samples: 81600270. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:25:28,119][15372] Avg episode reward: [(0, '41.357')] [2024-08-05 18:25:30,291][15444] Updated weights for policy 0, policy_version 39851 (0.0015) [2024-08-05 18:25:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 326524928. Throughput: 0: 6062.2. Samples: 81636750. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 18:25:33,127][15372] Avg episode reward: [(0, '41.452')] [2024-08-05 18:25:33,994][15444] Updated weights for policy 0, policy_version 39861 (0.0017) [2024-08-05 18:25:36,413][15417] Signal inference workers to stop experience collection... 
(14750 times) [2024-08-05 18:25:36,413][15417] Signal inference workers to resume experience collection... (14750 times) [2024-08-05 18:25:36,467][15444] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-08-05 18:25:36,468][15444] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-08-05 18:25:36,928][15444] Updated weights for policy 0, policy_version 39871 (0.0029) [2024-08-05 18:25:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 326639616. Throughput: 0: 6072.4. Samples: 81655120. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 18:25:38,119][15372] Avg episode reward: [(0, '42.531')] [2024-08-05 18:25:40,433][15444] Updated weights for policy 0, policy_version 39881 (0.0028) [2024-08-05 18:25:43,120][15372] Fps is (10 sec: 24572.7, 60 sec: 24438.9, 300 sec: 24214.9). Total num frames: 326770688. Throughput: 0: 6082.0. Samples: 81692130. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 18:25:43,120][15372] Avg episode reward: [(0, '42.076')] [2024-08-05 18:25:43,882][15444] Updated weights for policy 0, policy_version 39891 (0.0011) [2024-08-05 18:25:47,429][15444] Updated weights for policy 0, policy_version 39901 (0.0018) [2024-08-05 18:25:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 326885376. Throughput: 0: 6054.0. Samples: 81727970. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 18:25:48,119][15372] Avg episode reward: [(0, '42.481')] [2024-08-05 18:25:50,421][15444] Updated weights for policy 0, policy_version 39911 (0.0022) [2024-08-05 18:25:53,119][15372] Fps is (10 sec: 23759.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 327008256. Throughput: 0: 6050.7. Samples: 81746230. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 18:25:53,126][15372] Avg episode reward: [(0, '42.186')] [2024-08-05 18:25:54,146][15444] Updated weights for policy 0, policy_version 39921 (0.0013) [2024-08-05 18:25:57,384][15444] Updated weights for policy 0, policy_version 39931 (0.0012) [2024-08-05 18:25:58,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 327122944. Throughput: 0: 6043.3. Samples: 81782250. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:25:58,126][15372] Avg episode reward: [(0, '41.918')] [2024-08-05 18:26:00,819][15444] Updated weights for policy 0, policy_version 39941 (0.0032) [2024-08-05 18:26:03,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 327245824. Throughput: 0: 6048.7. Samples: 81818320. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:26:03,127][15372] Avg episode reward: [(0, '41.530')] [2024-08-05 18:26:04,339][15444] Updated weights for policy 0, policy_version 39951 (0.0011) [2024-08-05 18:26:07,727][15444] Updated weights for policy 0, policy_version 39961 (0.0024) [2024-08-05 18:26:08,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 327368704. Throughput: 0: 6048.4. Samples: 81836310. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:26:08,119][15372] Avg episode reward: [(0, '41.634')] [2024-08-05 18:26:11,066][15444] Updated weights for policy 0, policy_version 39971 (0.0051) [2024-08-05 18:26:13,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 327491584. Throughput: 0: 6044.7. Samples: 81872280. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:26:13,119][15372] Avg episode reward: [(0, '41.339')] [2024-08-05 18:26:14,484][15444] Updated weights for policy 0, policy_version 39981 (0.0023) [2024-08-05 18:26:18,119][15372] Fps is (10 sec: 22935.2, 60 sec: 23892.9, 300 sec: 24159.4). 
Total num frames: 327598080. Throughput: 0: 6013.2. Samples: 81907350. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:26:18,127][15372] Avg episode reward: [(0, '42.432')] [2024-08-05 18:26:18,165][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000039991_327606272.pth... [2024-08-05 18:26:18,167][15444] Updated weights for policy 0, policy_version 39991 (0.0014) [2024-08-05 18:26:18,315][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000039283_321806336.pth [2024-08-05 18:26:21,301][15444] Updated weights for policy 0, policy_version 40001 (0.0012) [2024-08-05 18:26:22,981][15417] Signal inference workers to stop experience collection... (14800 times) [2024-08-05 18:26:22,989][15417] Signal inference workers to resume experience collection... (14800 times) [2024-08-05 18:26:23,041][15444] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-08-05 18:26:23,041][15444] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-08-05 18:26:23,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 327729152. Throughput: 0: 6014.7. Samples: 81925780. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:26:23,119][15372] Avg episode reward: [(0, '41.318')] [2024-08-05 18:26:24,863][15444] Updated weights for policy 0, policy_version 40011 (0.0018) [2024-08-05 18:26:28,119][15372] Fps is (10 sec: 24577.9, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 327843840. Throughput: 0: 6006.6. Samples: 81962420. 
Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 18:26:28,126][15372] Avg episode reward: [(0, '41.626')] [2024-08-05 18:26:28,321][15444] Updated weights for policy 0, policy_version 40021 (0.0013) [2024-08-05 18:26:31,491][15444] Updated weights for policy 0, policy_version 40031 (0.0029) [2024-08-05 18:26:33,119][15372] Fps is (10 sec: 22937.2, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 327958528. Throughput: 0: 5998.9. Samples: 81997920. Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 18:26:33,127][15372] Avg episode reward: [(0, '41.352')] [2024-08-05 18:26:35,176][15444] Updated weights for policy 0, policy_version 40041 (0.0020) [2024-08-05 18:26:38,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 328089600. Throughput: 0: 5977.3. Samples: 82015210. Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 18:26:38,126][15372] Avg episode reward: [(0, '41.868')] [2024-08-05 18:26:38,765][15444] Updated weights for policy 0, policy_version 40051 (0.0011) [2024-08-05 18:26:41,901][15444] Updated weights for policy 0, policy_version 40061 (0.0024) [2024-08-05 18:26:43,118][15372] Fps is (10 sec: 23757.2, 60 sec: 23757.3, 300 sec: 24103.9). Total num frames: 328196096. Throughput: 0: 5987.8. Samples: 82051700. Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 18:26:43,119][15372] Avg episode reward: [(0, '41.697')] [2024-08-05 18:26:45,500][15444] Updated weights for policy 0, policy_version 40071 (0.0032) [2024-08-05 18:26:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 328327168. Throughput: 0: 5985.6. Samples: 82087670. 
Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 18:26:48,126][15372] Avg episode reward: [(0, '41.684')] [2024-08-05 18:26:48,721][15444] Updated weights for policy 0, policy_version 40081 (0.0023) [2024-08-05 18:26:52,196][15444] Updated weights for policy 0, policy_version 40091 (0.0012) [2024-08-05 18:26:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 328441856. Throughput: 0: 5979.6. Samples: 82105390. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 18:26:53,119][15372] Avg episode reward: [(0, '42.481')] [2024-08-05 18:26:55,751][15444] Updated weights for policy 0, policy_version 40101 (0.0020) [2024-08-05 18:26:56,152][15417] Signal inference workers to stop experience collection... (14850 times) [2024-08-05 18:26:56,153][15417] Signal inference workers to resume experience collection... (14850 times) [2024-08-05 18:26:56,207][15444] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-08-05 18:26:56,216][15444] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-08-05 18:26:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 328564736. Throughput: 0: 5989.4. Samples: 82141800. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 18:26:58,119][15372] Avg episode reward: [(0, '42.406')] [2024-08-05 18:26:58,666][15444] Updated weights for policy 0, policy_version 40111 (0.0019) [2024-08-05 18:27:02,345][15444] Updated weights for policy 0, policy_version 40121 (0.0019) [2024-08-05 18:27:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 328687616. Throughput: 0: 6008.6. Samples: 82177730. 
Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 18:27:03,119][15372] Avg episode reward: [(0, '42.271')] [2024-08-05 18:27:05,695][15444] Updated weights for policy 0, policy_version 40131 (0.0028) [2024-08-05 18:27:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.3, 300 sec: 24132.2). Total num frames: 328802304. Throughput: 0: 6012.2. Samples: 82196330. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 18:27:08,126][15372] Avg episode reward: [(0, '43.109')] [2024-08-05 18:27:09,010][15444] Updated weights for policy 0, policy_version 40141 (0.0012) [2024-08-05 18:27:12,825][15444] Updated weights for policy 0, policy_version 40151 (0.0032) [2024-08-05 18:27:13,118][15372] Fps is (10 sec: 23757.0, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 328925184. Throughput: 0: 5982.3. Samples: 82231620. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 18:27:13,119][15372] Avg episode reward: [(0, '42.921')] [2024-08-05 18:27:15,851][15444] Updated weights for policy 0, policy_version 40161 (0.0018) [2024-08-05 18:27:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.8, 300 sec: 24131.7). Total num frames: 329048064. Throughput: 0: 6000.0. Samples: 82267920. Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 18:27:18,126][15372] Avg episode reward: [(0, '42.310')] [2024-08-05 18:27:19,573][15444] Updated weights for policy 0, policy_version 40171 (0.0019) [2024-08-05 18:27:22,776][15444] Updated weights for policy 0, policy_version 40181 (0.0011) [2024-08-05 18:27:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 329170944. Throughput: 0: 6010.0. Samples: 82285660. Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 18:27:23,119][15372] Avg episode reward: [(0, '41.415')] [2024-08-05 18:27:26,169][15444] Updated weights for policy 0, policy_version 40191 (0.0020) [2024-08-05 18:27:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.0, 300 sec: 24131.7). 
Total num frames: 329285632. Throughput: 0: 6002.4. Samples: 82321810. Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 18:27:28,126][15372] Avg episode reward: [(0, '42.413')] [2024-08-05 18:27:29,776][15444] Updated weights for policy 0, policy_version 40201 (0.0024) [2024-08-05 18:27:32,692][15444] Updated weights for policy 0, policy_version 40211 (0.0015) [2024-08-05 18:27:33,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 329416704. Throughput: 0: 6019.1. Samples: 82358530. Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 18:27:33,119][15372] Avg episode reward: [(0, '40.805')] [2024-08-05 18:27:36,462][15444] Updated weights for policy 0, policy_version 40221 (0.0015) [2024-08-05 18:27:36,816][15417] Signal inference workers to stop experience collection... (14900 times) [2024-08-05 18:27:36,817][15417] Signal inference workers to resume experience collection... (14900 times) [2024-08-05 18:27:36,847][15444] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-08-05 18:27:36,848][15444] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-08-05 18:27:38,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 329531392. Throughput: 0: 6043.8. Samples: 82377360. Policy #0 lag: (min: 1.0, avg: 4.3, max: 7.0) [2024-08-05 18:27:38,119][15372] Avg episode reward: [(0, '41.757')] [2024-08-05 18:27:39,555][15444] Updated weights for policy 0, policy_version 40231 (0.0024) [2024-08-05 18:27:43,091][15444] Updated weights for policy 0, policy_version 40241 (0.0014) [2024-08-05 18:27:43,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 329654272. Throughput: 0: 6045.6. Samples: 82413850. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:27:43,119][15372] Avg episode reward: [(0, '42.432')] [2024-08-05 18:27:46,351][15444] Updated weights for policy 0, policy_version 40251 (0.0033) [2024-08-05 18:27:48,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 329768960. Throughput: 0: 6039.3. Samples: 82449500. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:27:48,126][15372] Avg episode reward: [(0, '42.668')] [2024-08-05 18:27:49,717][15444] Updated weights for policy 0, policy_version 40261 (0.0019) [2024-08-05 18:27:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 329891840. Throughput: 0: 6044.9. Samples: 82468350. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:27:53,126][15372] Avg episode reward: [(0, '42.899')] [2024-08-05 18:27:53,366][15444] Updated weights for policy 0, policy_version 40271 (0.0010) [2024-08-05 18:27:56,421][15444] Updated weights for policy 0, policy_version 40281 (0.0010) [2024-08-05 18:27:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 330014720. Throughput: 0: 6055.1. Samples: 82504100. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:27:58,126][15372] Avg episode reward: [(0, '42.373')] [2024-08-05 18:28:00,007][15444] Updated weights for policy 0, policy_version 40291 (0.0012) [2024-08-05 18:28:03,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24166.2, 300 sec: 24159.4). Total num frames: 330137600. Throughput: 0: 6064.4. Samples: 82540820. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:28:03,128][15372] Avg episode reward: [(0, '41.786')] [2024-08-05 18:28:03,210][15444] Updated weights for policy 0, policy_version 40301 (0.0019) [2024-08-05 18:28:06,743][15444] Updated weights for policy 0, policy_version 40311 (0.0018) [2024-08-05 18:28:08,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.3, 300 sec: 24131.7). 
Total num frames: 330252288. Throughput: 0: 6070.2. Samples: 82558820. Policy #0 lag: (min: 1.0, avg: 4.7, max: 7.0) [2024-08-05 18:28:08,119][15372] Avg episode reward: [(0, '41.777')] [2024-08-05 18:28:10,462][15444] Updated weights for policy 0, policy_version 40321 (0.0017) [2024-08-05 18:28:13,118][15372] Fps is (10 sec: 23758.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 330375168. Throughput: 0: 6068.9. Samples: 82594910. Policy #0 lag: (min: 1.0, avg: 4.7, max: 7.0) [2024-08-05 18:28:13,119][15372] Avg episode reward: [(0, '43.150')] [2024-08-05 18:28:13,367][15444] Updated weights for policy 0, policy_version 40331 (0.0021) [2024-08-05 18:28:17,010][15444] Updated weights for policy 0, policy_version 40341 (0.0021) [2024-08-05 18:28:18,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 330498048. Throughput: 0: 6052.5. Samples: 82630890. Policy #0 lag: (min: 1.0, avg: 4.7, max: 7.0) [2024-08-05 18:28:18,119][15372] Avg episode reward: [(0, '42.171')] [2024-08-05 18:28:18,127][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000040345_330506240.pth... [2024-08-05 18:28:18,252][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000039637_324706304.pth [2024-08-05 18:28:19,494][15417] Signal inference workers to stop experience collection... (14950 times) [2024-08-05 18:28:19,495][15417] Signal inference workers to resume experience collection... (14950 times) [2024-08-05 18:28:19,539][15444] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-08-05 18:28:19,540][15444] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-08-05 18:28:20,222][15444] Updated weights for policy 0, policy_version 40351 (0.0013) [2024-08-05 18:28:23,120][15372] Fps is (10 sec: 23753.2, 60 sec: 24029.3, 300 sec: 24131.6). 
Total num frames: 330612736. Throughput: 0: 6043.6. Samples: 82649330. Policy #0 lag: (min: 1.0, avg: 4.7, max: 7.0) [2024-08-05 18:28:23,120][15372] Avg episode reward: [(0, '41.958')] [2024-08-05 18:28:23,801][15444] Updated weights for policy 0, policy_version 40361 (0.0010) [2024-08-05 18:28:27,316][15444] Updated weights for policy 0, policy_version 40371 (0.0014) [2024-08-05 18:28:28,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 330735616. Throughput: 0: 6028.9. Samples: 82685150. Policy #0 lag: (min: 1.0, avg: 4.7, max: 7.0) [2024-08-05 18:28:28,119][15372] Avg episode reward: [(0, '41.913')] [2024-08-05 18:28:30,333][15444] Updated weights for policy 0, policy_version 40381 (0.0018) [2024-08-05 18:28:33,119][15372] Fps is (10 sec: 24579.4, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 330858496. Throughput: 0: 6031.8. Samples: 82720930. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 18:28:33,119][15372] Avg episode reward: [(0, '40.664')] [2024-08-05 18:28:34,073][15444] Updated weights for policy 0, policy_version 40391 (0.0020) [2024-08-05 18:28:37,266][15444] Updated weights for policy 0, policy_version 40401 (0.0026) [2024-08-05 18:28:38,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 330973184. Throughput: 0: 6024.2. Samples: 82739440. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 18:28:38,119][15372] Avg episode reward: [(0, '42.128')] [2024-08-05 18:28:40,809][15444] Updated weights for policy 0, policy_version 40411 (0.0016) [2024-08-05 18:28:43,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.8, 300 sec: 24104.7). Total num frames: 331096064. Throughput: 0: 6020.9. Samples: 82775040. 
Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 18:28:43,126][15372] Avg episode reward: [(0, '42.657')] [2024-08-05 18:28:44,203][15444] Updated weights for policy 0, policy_version 40421 (0.0025) [2024-08-05 18:28:48,003][15444] Updated weights for policy 0, policy_version 40431 (0.0034) [2024-08-05 18:28:48,120][15372] Fps is (10 sec: 23752.6, 60 sec: 24029.2, 300 sec: 24103.8). Total num frames: 331210752. Throughput: 0: 6001.4. Samples: 82810890. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 18:28:48,120][15372] Avg episode reward: [(0, '42.690')] [2024-08-05 18:28:50,952][15444] Updated weights for policy 0, policy_version 40441 (0.0022) [2024-08-05 18:28:53,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 331341824. Throughput: 0: 6007.4. Samples: 82829150. Policy #0 lag: (min: 0.0, avg: 4.7, max: 9.0) [2024-08-05 18:28:53,126][15372] Avg episode reward: [(0, '41.498')] [2024-08-05 18:28:54,784][15444] Updated weights for policy 0, policy_version 40451 (0.0033) [2024-08-05 18:28:57,060][15417] Signal inference workers to stop experience collection... (15000 times) [2024-08-05 18:28:57,064][15417] Signal inference workers to resume experience collection... (15000 times) [2024-08-05 18:28:57,110][15444] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-08-05 18:28:57,120][15444] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-08-05 18:28:57,989][15444] Updated weights for policy 0, policy_version 40461 (0.0025) [2024-08-05 18:28:58,119][15372] Fps is (10 sec: 24579.8, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 331456512. Throughput: 0: 5992.0. Samples: 82864550. 
Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 18:28:58,119][15372] Avg episode reward: [(0, '41.493')] [2024-08-05 18:29:01,355][15444] Updated weights for policy 0, policy_version 40471 (0.0011) [2024-08-05 18:29:03,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23893.5, 300 sec: 24103.9). Total num frames: 331571200. Throughput: 0: 5992.4. Samples: 82900550. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 18:29:03,126][15372] Avg episode reward: [(0, '42.575')] [2024-08-05 18:29:05,089][15444] Updated weights for policy 0, policy_version 40481 (0.0023) [2024-08-05 18:29:08,102][15444] Updated weights for policy 0, policy_version 40491 (0.0012) [2024-08-05 18:29:08,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 331702272. Throughput: 0: 5977.3. Samples: 82918300. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 18:29:08,119][15372] Avg episode reward: [(0, '42.777')] [2024-08-05 18:29:11,699][15444] Updated weights for policy 0, policy_version 40501 (0.0016) [2024-08-05 18:29:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 331816960. Throughput: 0: 5978.5. Samples: 82954180. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 18:29:13,119][15372] Avg episode reward: [(0, '42.707')] [2024-08-05 18:29:14,878][15444] Updated weights for policy 0, policy_version 40511 (0.0029) [2024-08-05 18:29:18,120][15372] Fps is (10 sec: 22934.1, 60 sec: 23892.7, 300 sec: 24103.8). Total num frames: 331931648. Throughput: 0: 5992.2. Samples: 82990590. Policy #0 lag: (min: 0.0, avg: 3.1, max: 7.0) [2024-08-05 18:29:18,128][15372] Avg episode reward: [(0, '42.558')] [2024-08-05 18:29:18,524][15444] Updated weights for policy 0, policy_version 40521 (0.0014) [2024-08-05 18:29:22,006][15444] Updated weights for policy 0, policy_version 40531 (0.0026) [2024-08-05 18:29:23,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24167.0, 300 sec: 24131.7). 
Total num frames: 332062720. Throughput: 0: 5981.3. Samples: 83008600. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:29:23,119][15372] Avg episode reward: [(0, '42.200')] [2024-08-05 18:29:24,978][15417] Signal inference workers to stop experience collection... (15050 times) [2024-08-05 18:29:24,981][15417] Signal inference workers to resume experience collection... (15050 times) [2024-08-05 18:29:25,022][15444] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-08-05 18:29:25,022][15444] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-08-05 18:29:25,058][15444] Updated weights for policy 0, policy_version 40541 (0.0012) [2024-08-05 18:29:28,118][15372] Fps is (10 sec: 24579.7, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 332177408. Throughput: 0: 6002.7. Samples: 83045160. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:29:28,119][15372] Avg episode reward: [(0, '42.130')] [2024-08-05 18:29:28,689][15444] Updated weights for policy 0, policy_version 40551 (0.0030) [2024-08-05 18:29:31,890][15444] Updated weights for policy 0, policy_version 40561 (0.0016) [2024-08-05 18:29:33,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 332300288. Throughput: 0: 6003.3. Samples: 83081030. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:29:33,119][15372] Avg episode reward: [(0, '42.422')] [2024-08-05 18:29:35,431][15444] Updated weights for policy 0, policy_version 40571 (0.0027) [2024-08-05 18:29:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 332423168. Throughput: 0: 6006.9. Samples: 83099460. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:29:38,119][15372] Avg episode reward: [(0, '41.605')] [2024-08-05 18:29:38,799][15444] Updated weights for policy 0, policy_version 40581 (0.0030) [2024-08-05 18:29:42,135][15444] Updated weights for policy 0, policy_version 40591 (0.0013) [2024-08-05 18:29:43,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 332537856. Throughput: 0: 6018.0. Samples: 83135360. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:29:43,119][15372] Avg episode reward: [(0, '41.277')] [2024-08-05 18:29:45,698][15444] Updated weights for policy 0, policy_version 40601 (0.0014) [2024-08-05 18:29:48,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24167.1, 300 sec: 24103.9). Total num frames: 332660736. Throughput: 0: 6040.2. Samples: 83172360. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:29:48,119][15372] Avg episode reward: [(0, '41.616')] [2024-08-05 18:29:48,828][15444] Updated weights for policy 0, policy_version 40611 (0.0021) [2024-08-05 18:29:52,500][15444] Updated weights for policy 0, policy_version 40621 (0.0012) [2024-08-05 18:29:53,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 332783616. Throughput: 0: 6036.5. Samples: 83189940. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:29:53,120][15372] Avg episode reward: [(0, '42.197')] [2024-08-05 18:29:55,446][15444] Updated weights for policy 0, policy_version 40631 (0.0018) [2024-08-05 18:29:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 332898304. Throughput: 0: 6040.2. Samples: 83225990. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:29:58,126][15372] Avg episode reward: [(0, '42.256')] [2024-08-05 18:29:59,280][15444] Updated weights for policy 0, policy_version 40641 (0.0019) [2024-08-05 18:30:02,789][15444] Updated weights for policy 0, policy_version 40651 (0.0013) [2024-08-05 18:30:03,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24166.3, 300 sec: 24076.1). Total num frames: 333021184. Throughput: 0: 6029.9. Samples: 83261930. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:30:03,119][15372] Avg episode reward: [(0, '42.415')] [2024-08-05 18:30:05,760][15444] Updated weights for policy 0, policy_version 40661 (0.0012) [2024-08-05 18:30:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 333144064. Throughput: 0: 6035.6. Samples: 83280200. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:30:08,128][15372] Avg episode reward: [(0, '42.469')] [2024-08-05 18:30:09,719][15444] Updated weights for policy 0, policy_version 40671 (0.0020) [2024-08-05 18:30:09,817][15417] Signal inference workers to stop experience collection... (15100 times) [2024-08-05 18:30:09,819][15417] Signal inference workers to resume experience collection... (15100 times) [2024-08-05 18:30:09,857][15444] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-08-05 18:30:09,858][15444] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-08-05 18:30:12,834][15444] Updated weights for policy 0, policy_version 40681 (0.0014) [2024-08-05 18:30:13,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 24076.1). Total num frames: 333266944. Throughput: 0: 6014.4. Samples: 83315810. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:30:13,119][15372] Avg episode reward: [(0, '42.256')] [2024-08-05 18:30:16,185][15444] Updated weights for policy 0, policy_version 40691 (0.0010) [2024-08-05 18:30:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24167.0, 300 sec: 24076.1). Total num frames: 333381632. Throughput: 0: 6014.2. Samples: 83351670. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:30:18,126][15372] Avg episode reward: [(0, '41.896')] [2024-08-05 18:30:18,131][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000040696_333381632.pth... [2024-08-05 18:30:18,272][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000039991_327606272.pth [2024-08-05 18:30:19,575][15444] Updated weights for policy 0, policy_version 40701 (0.0021) [2024-08-05 18:30:23,054][15444] Updated weights for policy 0, policy_version 40711 (0.0011) [2024-08-05 18:30:23,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 333504512. Throughput: 0: 6014.9. Samples: 83370130. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:30:23,119][15372] Avg episode reward: [(0, '41.274')] [2024-08-05 18:30:26,397][15444] Updated weights for policy 0, policy_version 40721 (0.0014) [2024-08-05 18:30:28,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.3, 300 sec: 24076.1). Total num frames: 333627392. Throughput: 0: 6016.2. Samples: 83406090. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:30:28,126][15372] Avg episode reward: [(0, '41.005')] [2024-08-05 18:30:29,704][15444] Updated weights for policy 0, policy_version 40731 (0.0031) [2024-08-05 18:30:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 333742080. Throughput: 0: 6003.3. Samples: 83442510. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:30:33,119][15372] Avg episode reward: [(0, '41.099')] [2024-08-05 18:30:33,352][15444] Updated weights for policy 0, policy_version 40741 (0.0014) [2024-08-05 18:30:36,550][15444] Updated weights for policy 0, policy_version 40751 (0.0011) [2024-08-05 18:30:38,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24029.6, 300 sec: 24048.4). Total num frames: 333864960. Throughput: 0: 6019.7. Samples: 83460830. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:30:38,127][15372] Avg episode reward: [(0, '41.949')] [2024-08-05 18:30:39,927][15444] Updated weights for policy 0, policy_version 40761 (0.0023) [2024-08-05 18:30:43,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.5, 300 sec: 24076.1). Total num frames: 333987840. Throughput: 0: 6039.1. Samples: 83497750. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:30:43,127][15372] Avg episode reward: [(0, '42.080')] [2024-08-05 18:30:43,338][15444] Updated weights for policy 0, policy_version 40771 (0.0023) [2024-08-05 18:30:46,636][15444] Updated weights for policy 0, policy_version 40781 (0.0011) [2024-08-05 18:30:48,118][15372] Fps is (10 sec: 24577.4, 60 sec: 24166.4, 300 sec: 24076.2). Total num frames: 334110720. Throughput: 0: 6039.4. Samples: 83533700. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:30:48,126][15372] Avg episode reward: [(0, '41.722')] [2024-08-05 18:30:49,905][15444] Updated weights for policy 0, policy_version 40791 (0.0011) [2024-08-05 18:30:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 334233600. Throughput: 0: 6054.7. Samples: 83552660. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:30:53,126][15372] Avg episode reward: [(0, '41.098')] [2024-08-05 18:30:53,370][15444] Updated weights for policy 0, policy_version 40801 (0.0013) [2024-08-05 18:30:56,557][15444] Updated weights for policy 0, policy_version 40811 (0.0032) [2024-08-05 18:30:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 334356480. Throughput: 0: 6059.6. Samples: 83588490. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:30:58,126][15372] Avg episode reward: [(0, '41.770')] [2024-08-05 18:30:59,986][15444] Updated weights for policy 0, policy_version 40821 (0.0017) [2024-08-05 18:31:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.1, 300 sec: 24103.9). Total num frames: 334479360. Throughput: 0: 6081.8. Samples: 83625350. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:31:03,126][15372] Avg episode reward: [(0, '42.469')] [2024-08-05 18:31:03,544][15444] Updated weights for policy 0, policy_version 40831 (0.0027) [2024-08-05 18:31:06,692][15444] Updated weights for policy 0, policy_version 40841 (0.0012) [2024-08-05 18:31:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 334602240. Throughput: 0: 6088.2. Samples: 83644100. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:31:08,119][15372] Avg episode reward: [(0, '41.833')] [2024-08-05 18:31:10,120][15444] Updated weights for policy 0, policy_version 40851 (0.0018) [2024-08-05 18:31:12,605][15417] Signal inference workers to stop experience collection... (15150 times) [2024-08-05 18:31:12,605][15417] Signal inference workers to resume experience collection... 
(15150 times) [2024-08-05 18:31:12,643][15444] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-08-05 18:31:12,643][15444] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-08-05 18:31:13,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 334725120. Throughput: 0: 6104.9. Samples: 83680810. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:31:13,126][15372] Avg episode reward: [(0, '41.327')] [2024-08-05 18:31:13,560][15444] Updated weights for policy 0, policy_version 40861 (0.0020) [2024-08-05 18:31:16,758][15444] Updated weights for policy 0, policy_version 40871 (0.0021) [2024-08-05 18:31:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 334839808. Throughput: 0: 6094.7. Samples: 83716770. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:31:18,126][15372] Avg episode reward: [(0, '41.744')] [2024-08-05 18:31:20,379][15444] Updated weights for policy 0, policy_version 40881 (0.0011) [2024-08-05 18:31:23,119][15372] Fps is (10 sec: 23757.1, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 334962688. Throughput: 0: 6113.0. Samples: 83735910. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:31:23,119][15372] Avg episode reward: [(0, '41.453')] [2024-08-05 18:31:23,515][15444] Updated weights for policy 0, policy_version 40891 (0.0024) [2024-08-05 18:31:27,155][15444] Updated weights for policy 0, policy_version 40901 (0.0017) [2024-08-05 18:31:28,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 335085568. Throughput: 0: 6083.1. Samples: 83771490. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:31:28,119][15372] Avg episode reward: [(0, '41.697')] [2024-08-05 18:31:30,243][15444] Updated weights for policy 0, policy_version 40911 (0.0027) [2024-08-05 18:31:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24103.9). 
Total num frames: 335200256. Throughput: 0: 6084.7. Samples: 83807510. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:31:33,126][15372] Avg episode reward: [(0, '41.121')] [2024-08-05 18:31:33,821][15444] Updated weights for policy 0, policy_version 40921 (0.0023) [2024-08-05 18:31:37,345][15444] Updated weights for policy 0, policy_version 40931 (0.0030) [2024-08-05 18:31:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.2, 300 sec: 24159.5). Total num frames: 335323136. Throughput: 0: 6072.7. Samples: 83825930. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:31:38,119][15372] Avg episode reward: [(0, '41.099')] [2024-08-05 18:31:40,504][15444] Updated weights for policy 0, policy_version 40941 (0.0021) [2024-08-05 18:31:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 335446016. Throughput: 0: 6074.0. Samples: 83861820. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:31:43,126][15372] Avg episode reward: [(0, '41.821')] [2024-08-05 18:31:44,114][15444] Updated weights for policy 0, policy_version 40951 (0.0011) [2024-08-05 18:31:47,364][15444] Updated weights for policy 0, policy_version 40961 (0.0020) [2024-08-05 18:31:48,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 335560704. Throughput: 0: 6046.6. Samples: 83897450. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:31:48,126][15372] Avg episode reward: [(0, '42.429')] [2024-08-05 18:31:50,763][15444] Updated weights for policy 0, policy_version 40971 (0.0014) [2024-08-05 18:31:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 335683584. Throughput: 0: 6033.3. Samples: 83915600. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:31:53,126][15372] Avg episode reward: [(0, '43.011')] [2024-08-05 18:31:54,249][15444] Updated weights for policy 0, policy_version 40981 (0.0018) [2024-08-05 18:31:57,687][15444] Updated weights for policy 0, policy_version 40991 (0.0014) [2024-08-05 18:31:58,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 335806464. Throughput: 0: 6025.8. Samples: 83951970. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:31:58,119][15372] Avg episode reward: [(0, '42.280')] [2024-08-05 18:32:01,116][15444] Updated weights for policy 0, policy_version 41001 (0.0012) [2024-08-05 18:32:02,663][15417] Signal inference workers to stop experience collection... (15200 times) [2024-08-05 18:32:02,671][15417] Signal inference workers to resume experience collection... (15200 times) [2024-08-05 18:32:02,706][15444] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-08-05 18:32:02,706][15444] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-08-05 18:32:03,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 335921152. Throughput: 0: 6014.0. Samples: 83987400. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:32:03,119][15372] Avg episode reward: [(0, '41.528')] [2024-08-05 18:32:04,621][15444] Updated weights for policy 0, policy_version 41011 (0.0011) [2024-08-05 18:32:07,791][15444] Updated weights for policy 0, policy_version 41021 (0.0036) [2024-08-05 18:32:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 336044032. Throughput: 0: 6006.0. Samples: 84006180. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:32:08,119][15372] Avg episode reward: [(0, '41.332')] [2024-08-05 18:32:11,150][15444] Updated weights for policy 0, policy_version 41031 (0.0015) [2024-08-05 18:32:13,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 336166912. Throughput: 0: 6003.6. Samples: 84041650. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:32:13,126][15372] Avg episode reward: [(0, '41.982')] [2024-08-05 18:32:14,871][15444] Updated weights for policy 0, policy_version 41041 (0.0023) [2024-08-05 18:32:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 336281600. Throughput: 0: 6009.3. Samples: 84077930. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:32:18,126][15372] Avg episode reward: [(0, '43.041')] [2024-08-05 18:32:18,162][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000041051_336289792.pth... [2024-08-05 18:32:18,192][15444] Updated weights for policy 0, policy_version 41051 (0.0012) [2024-08-05 18:32:18,326][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000040345_330506240.pth [2024-08-05 18:32:21,431][15444] Updated weights for policy 0, policy_version 41061 (0.0012) [2024-08-05 18:32:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 336404480. Throughput: 0: 6008.2. Samples: 84096300. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:32:23,119][15372] Avg episode reward: [(0, '42.380')] [2024-08-05 18:32:25,095][15444] Updated weights for policy 0, policy_version 41071 (0.0012) [2024-08-05 18:32:28,093][15444] Updated weights for policy 0, policy_version 41081 (0.0020) [2024-08-05 18:32:28,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 336535552. 
Throughput: 0: 6019.6. Samples: 84132700. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 18:32:28,119][15372] Avg episode reward: [(0, '42.937')] [2024-08-05 18:32:31,726][15444] Updated weights for policy 0, policy_version 41091 (0.0020) [2024-08-05 18:32:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 336650240. Throughput: 0: 6028.2. Samples: 84168720. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 18:32:33,119][15372] Avg episode reward: [(0, '43.427')] [2024-08-05 18:32:34,893][15444] Updated weights for policy 0, policy_version 41101 (0.0019) [2024-08-05 18:32:38,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 336773120. Throughput: 0: 6046.6. Samples: 84187700. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 18:32:38,126][15372] Avg episode reward: [(0, '42.522')] [2024-08-05 18:32:38,366][15444] Updated weights for policy 0, policy_version 41111 (0.0026) [2024-08-05 18:32:41,728][15444] Updated weights for policy 0, policy_version 41121 (0.0022) [2024-08-05 18:32:43,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 336896000. Throughput: 0: 6030.2. Samples: 84223330. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 18:32:43,119][15372] Avg episode reward: [(0, '42.076')] [2024-08-05 18:32:44,646][15417] Signal inference workers to stop experience collection... (15250 times) [2024-08-05 18:32:44,647][15417] Signal inference workers to resume experience collection... (15250 times) [2024-08-05 18:32:44,692][15444] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-08-05 18:32:44,692][15444] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-08-05 18:32:44,969][15444] Updated weights for policy 0, policy_version 41131 (0.0017) [2024-08-05 18:32:48,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 337010688. 
Throughput: 0: 6068.9. Samples: 84260500. Policy #0 lag: (min: 1.0, avg: 4.3, max: 9.0) [2024-08-05 18:32:48,119][15372] Avg episode reward: [(0, '42.179')] [2024-08-05 18:32:48,399][15444] Updated weights for policy 0, policy_version 41141 (0.0024) [2024-08-05 18:32:51,825][15444] Updated weights for policy 0, policy_version 41151 (0.0021) [2024-08-05 18:32:53,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 337133568. Throughput: 0: 6050.4. Samples: 84278450. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 18:32:53,119][15372] Avg episode reward: [(0, '41.822')] [2024-08-05 18:32:55,098][15444] Updated weights for policy 0, policy_version 41161 (0.0012) [2024-08-05 18:32:58,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 337256448. Throughput: 0: 6076.2. Samples: 84315080. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 18:32:58,132][15372] Avg episode reward: [(0, '42.064')] [2024-08-05 18:32:58,812][15444] Updated weights for policy 0, policy_version 41171 (0.0022) [2024-08-05 18:33:02,065][15444] Updated weights for policy 0, policy_version 41181 (0.0017) [2024-08-05 18:33:03,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 337379328. Throughput: 0: 6056.0. Samples: 84350450. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 18:33:03,119][15372] Avg episode reward: [(0, '42.022')] [2024-08-05 18:33:05,340][15444] Updated weights for policy 0, policy_version 41191 (0.0011) [2024-08-05 18:33:08,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 337502208. Throughput: 0: 6067.5. Samples: 84369340. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 18:33:08,119][15372] Avg episode reward: [(0, '41.942')] [2024-08-05 18:33:08,650][15444] Updated weights for policy 0, policy_version 41201 (0.0011) [2024-08-05 18:33:12,318][15444] Updated weights for policy 0, policy_version 41211 (0.0024) [2024-08-05 18:33:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 337625088. Throughput: 0: 6051.8. Samples: 84405030. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-08-05 18:33:13,119][15372] Avg episode reward: [(0, '42.256')] [2024-08-05 18:33:15,578][15444] Updated weights for policy 0, policy_version 41221 (0.0029) [2024-08-05 18:33:18,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24302.8, 300 sec: 24159.5). Total num frames: 337739776. Throughput: 0: 6053.3. Samples: 84441120. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:33:18,127][15372] Avg episode reward: [(0, '41.467')] [2024-08-05 18:33:19,018][15444] Updated weights for policy 0, policy_version 41231 (0.0015) [2024-08-05 18:33:22,335][15444] Updated weights for policy 0, policy_version 41241 (0.0013) [2024-08-05 18:33:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 337862656. Throughput: 0: 6034.9. Samples: 84459270. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:33:23,126][15372] Avg episode reward: [(0, '42.050')] [2024-08-05 18:33:25,660][15444] Updated weights for policy 0, policy_version 41251 (0.0020) [2024-08-05 18:33:28,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 337985536. Throughput: 0: 6049.8. Samples: 84495570. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:33:28,126][15372] Avg episode reward: [(0, '42.506')] [2024-08-05 18:33:29,423][15444] Updated weights for policy 0, policy_version 41261 (0.0013) [2024-08-05 18:33:32,643][15444] Updated weights for policy 0, policy_version 41271 (0.0014) [2024-08-05 18:33:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 338100224. Throughput: 0: 6010.7. Samples: 84530980. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:33:33,119][15372] Avg episode reward: [(0, '40.732')] [2024-08-05 18:33:35,953][15444] Updated weights for policy 0, policy_version 41281 (0.0019) [2024-08-05 18:33:38,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 338223104. Throughput: 0: 6030.3. Samples: 84549810. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:33:38,126][15372] Avg episode reward: [(0, '41.470')] [2024-08-05 18:33:39,534][15444] Updated weights for policy 0, policy_version 41291 (0.0022) [2024-08-05 18:33:39,666][15417] Signal inference workers to stop experience collection... (15300 times) [2024-08-05 18:33:39,667][15417] Signal inference workers to resume experience collection... (15300 times) [2024-08-05 18:33:39,704][15444] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-08-05 18:33:39,704][15444] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-08-05 18:33:42,692][15444] Updated weights for policy 0, policy_version 41301 (0.0019) [2024-08-05 18:33:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.4). Total num frames: 338345984. Throughput: 0: 6024.2. Samples: 84586170. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:33:43,119][15372] Avg episode reward: [(0, '42.173')] [2024-08-05 18:33:46,381][15444] Updated weights for policy 0, policy_version 41311 (0.0032) [2024-08-05 18:33:48,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 338460672. Throughput: 0: 6032.0. Samples: 84621890. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:33:48,119][15372] Avg episode reward: [(0, '42.440')] [2024-08-05 18:33:49,458][15444] Updated weights for policy 0, policy_version 41321 (0.0014) [2024-08-05 18:33:52,875][15444] Updated weights for policy 0, policy_version 41331 (0.0010) [2024-08-05 18:33:53,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 338583552. Throughput: 0: 6028.7. Samples: 84640630. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:33:53,119][15372] Avg episode reward: [(0, '41.455')] [2024-08-05 18:33:56,411][15444] Updated weights for policy 0, policy_version 41341 (0.0028) [2024-08-05 18:33:58,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 338706432. Throughput: 0: 6031.1. Samples: 84676430. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:33:58,126][15372] Avg episode reward: [(0, '41.537')] [2024-08-05 18:33:59,784][15444] Updated weights for policy 0, policy_version 41351 (0.0024) [2024-08-05 18:34:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 338821120. Throughput: 0: 6041.0. Samples: 84712960. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:34:03,126][15372] Avg episode reward: [(0, '43.276')] [2024-08-05 18:34:03,238][15444] Updated weights for policy 0, policy_version 41361 (0.0010) [2024-08-05 18:34:06,341][15444] Updated weights for policy 0, policy_version 41371 (0.0014) [2024-08-05 18:34:08,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). 
Total num frames: 338952192. Throughput: 0: 6056.7. Samples: 84731820. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:34:08,126][15372] Avg episode reward: [(0, '43.458')] [2024-08-05 18:34:09,736][15444] Updated weights for policy 0, policy_version 41381 (0.0021) [2024-08-05 18:34:13,090][15444] Updated weights for policy 0, policy_version 41391 (0.0019) [2024-08-05 18:34:13,119][15372] Fps is (10 sec: 25394.8, 60 sec: 24166.3, 300 sec: 24215.1). Total num frames: 339075072. Throughput: 0: 6056.2. Samples: 84768100. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:34:13,119][15372] Avg episode reward: [(0, '42.403')] [2024-08-05 18:34:16,707][15444] Updated weights for policy 0, policy_version 41401 (0.0022) [2024-08-05 18:34:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.6, 300 sec: 24159.5). Total num frames: 339189760. Throughput: 0: 6058.4. Samples: 84803610. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:34:18,119][15372] Avg episode reward: [(0, '41.842')] [2024-08-05 18:34:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000041405_339189760.pth... [2024-08-05 18:34:18,241][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000040696_333381632.pth [2024-08-05 18:34:20,025][15444] Updated weights for policy 0, policy_version 41411 (0.0025) [2024-08-05 18:34:23,118][15372] Fps is (10 sec: 22937.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 339304448. Throughput: 0: 6058.9. Samples: 84822460. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:34:23,126][15372] Avg episode reward: [(0, '41.988')] [2024-08-05 18:34:23,471][15444] Updated weights for policy 0, policy_version 41421 (0.0016) [2024-08-05 18:34:26,996][15444] Updated weights for policy 0, policy_version 41431 (0.0012) [2024-08-05 18:34:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 339435520. Throughput: 0: 6043.1. Samples: 84858110. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:34:28,119][15372] Avg episode reward: [(0, '41.623')] [2024-08-05 18:34:30,114][15444] Updated weights for policy 0, policy_version 41441 (0.0017) [2024-08-05 18:34:31,674][15417] Signal inference workers to stop experience collection... (15350 times) [2024-08-05 18:34:31,674][15417] Signal inference workers to resume experience collection... (15350 times) [2024-08-05 18:34:31,712][15444] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-08-05 18:34:31,712][15444] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-08-05 18:34:33,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 339558400. Throughput: 0: 6069.6. Samples: 84895020. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:34:33,119][15372] Avg episode reward: [(0, '42.892')] [2024-08-05 18:34:33,582][15444] Updated weights for policy 0, policy_version 41451 (0.0012) [2024-08-05 18:34:36,866][15444] Updated weights for policy 0, policy_version 41461 (0.0034) [2024-08-05 18:34:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 339681280. Throughput: 0: 6072.7. Samples: 84913900. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 18:34:38,119][15372] Avg episode reward: [(0, '43.170')] [2024-08-05 18:34:40,083][15444] Updated weights for policy 0, policy_version 41471 (0.0011) [2024-08-05 18:34:43,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 339804160. Throughput: 0: 6099.6. Samples: 84950910. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 18:34:43,119][15372] Avg episode reward: [(0, '42.965')] [2024-08-05 18:34:43,643][15444] Updated weights for policy 0, policy_version 41481 (0.0014) [2024-08-05 18:34:46,847][15444] Updated weights for policy 0, policy_version 41491 (0.0024) [2024-08-05 18:34:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 339918848. Throughput: 0: 6079.6. Samples: 84986540. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 18:34:48,119][15372] Avg episode reward: [(0, '41.872')] [2024-08-05 18:34:50,168][15444] Updated weights for policy 0, policy_version 41501 (0.0011) [2024-08-05 18:34:53,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 340041728. Throughput: 0: 6068.2. Samples: 85004890. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 18:34:53,126][15372] Avg episode reward: [(0, '41.798')] [2024-08-05 18:34:53,634][15444] Updated weights for policy 0, policy_version 41511 (0.0034) [2024-08-05 18:34:56,922][15444] Updated weights for policy 0, policy_version 41521 (0.0011) [2024-08-05 18:34:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 340164608. Throughput: 0: 6064.5. Samples: 85041000. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 18:34:58,119][15372] Avg episode reward: [(0, '42.468')] [2024-08-05 18:35:00,448][15444] Updated weights for policy 0, policy_version 41531 (0.0023) [2024-08-05 18:35:03,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24302.9, 300 sec: 24187.2). 
Total num frames: 340279296. Throughput: 0: 6071.6. Samples: 85076830. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:35:03,126][15372] Avg episode reward: [(0, '42.867')] [2024-08-05 18:35:03,869][15444] Updated weights for policy 0, policy_version 41541 (0.0019) [2024-08-05 18:35:07,426][15444] Updated weights for policy 0, policy_version 41551 (0.0011) [2024-08-05 18:35:08,120][15372] Fps is (10 sec: 23753.1, 60 sec: 24165.8, 300 sec: 24187.1). Total num frames: 340402176. Throughput: 0: 6052.7. Samples: 85094840. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:35:08,121][15372] Avg episode reward: [(0, '43.087')] [2024-08-05 18:35:10,535][15444] Updated weights for policy 0, policy_version 41561 (0.0024) [2024-08-05 18:35:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 340525056. Throughput: 0: 6065.3. Samples: 85131050. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:35:13,126][15372] Avg episode reward: [(0, '42.471')] [2024-08-05 18:35:14,193][15444] Updated weights for policy 0, policy_version 41571 (0.0023) [2024-08-05 18:35:17,576][15444] Updated weights for policy 0, policy_version 41581 (0.0023) [2024-08-05 18:35:18,118][15372] Fps is (10 sec: 23760.5, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 340639744. Throughput: 0: 6040.9. Samples: 85166860. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:35:18,119][15372] Avg episode reward: [(0, '42.158')] [2024-08-05 18:35:20,818][15444] Updated weights for policy 0, policy_version 41591 (0.0018) [2024-08-05 18:35:21,906][15417] Signal inference workers to stop experience collection... (15400 times) [2024-08-05 18:35:21,914][15417] Signal inference workers to resume experience collection... 
(15400 times) [2024-08-05 18:35:21,986][15444] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-08-05 18:35:21,987][15444] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-08-05 18:35:23,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 340762624. Throughput: 0: 6024.8. Samples: 85185020. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:35:23,119][15372] Avg episode reward: [(0, '42.811')] [2024-08-05 18:35:24,231][15444] Updated weights for policy 0, policy_version 41601 (0.0011) [2024-08-05 18:35:27,641][15444] Updated weights for policy 0, policy_version 41611 (0.0034) [2024-08-05 18:35:28,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 340885504. Throughput: 0: 6011.5. Samples: 85221430. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:35:28,119][15372] Avg episode reward: [(0, '42.133')] [2024-08-05 18:35:30,843][15444] Updated weights for policy 0, policy_version 41621 (0.0013) [2024-08-05 18:35:33,118][15372] Fps is (10 sec: 24577.3, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 341008384. Throughput: 0: 6026.7. Samples: 85257740. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:35:33,126][15372] Avg episode reward: [(0, '41.917')] [2024-08-05 18:35:34,715][15444] Updated weights for policy 0, policy_version 41631 (0.0033) [2024-08-05 18:35:37,922][15444] Updated weights for policy 0, policy_version 41641 (0.0022) [2024-08-05 18:35:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 341123072. Throughput: 0: 6007.8. Samples: 85275240. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:35:38,119][15372] Avg episode reward: [(0, '43.897')] [2024-08-05 18:35:41,150][15444] Updated weights for policy 0, policy_version 41651 (0.0029) [2024-08-05 18:35:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24187.2). 
Total num frames: 341245952. Throughput: 0: 6004.7. Samples: 85311210. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:35:43,126][15372] Avg episode reward: [(0, '43.455')] [2024-08-05 18:35:44,711][15444] Updated weights for policy 0, policy_version 41661 (0.0012) [2024-08-05 18:35:47,945][15444] Updated weights for policy 0, policy_version 41671 (0.0016) [2024-08-05 18:35:48,119][15372] Fps is (10 sec: 24574.6, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 341368832. Throughput: 0: 6021.3. Samples: 85347790. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:35:48,119][15372] Avg episode reward: [(0, '42.738')] [2024-08-05 18:35:51,357][15444] Updated weights for policy 0, policy_version 41681 (0.0022) [2024-08-05 18:35:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 341483520. Throughput: 0: 6040.4. Samples: 85366650. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 18:35:53,126][15372] Avg episode reward: [(0, '43.369')] [2024-08-05 18:35:54,793][15444] Updated weights for policy 0, policy_version 41691 (0.0013) [2024-08-05 18:35:58,118][15372] Fps is (10 sec: 23758.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 341606400. Throughput: 0: 6040.4. Samples: 85402870. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:35:58,126][15372] Avg episode reward: [(0, '42.172')] [2024-08-05 18:35:58,182][15444] Updated weights for policy 0, policy_version 41701 (0.0015) [2024-08-05 18:36:01,568][15444] Updated weights for policy 0, policy_version 41711 (0.0018) [2024-08-05 18:36:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 341729280. Throughput: 0: 6043.1. Samples: 85438800. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:36:03,126][15372] Avg episode reward: [(0, '41.673')] [2024-08-05 18:36:04,948][15444] Updated weights for policy 0, policy_version 41721 (0.0017) [2024-08-05 18:36:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24167.0, 300 sec: 24159.5). Total num frames: 341852160. Throughput: 0: 6057.4. Samples: 85457600. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:36:08,126][15372] Avg episode reward: [(0, '42.376')] [2024-08-05 18:36:08,295][15444] Updated weights for policy 0, policy_version 41731 (0.0018) [2024-08-05 18:36:11,780][15444] Updated weights for policy 0, policy_version 41741 (0.0030) [2024-08-05 18:36:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 341975040. Throughput: 0: 6054.9. Samples: 85493900. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:36:13,119][15372] Avg episode reward: [(0, '42.157')] [2024-08-05 18:36:14,981][15444] Updated weights for policy 0, policy_version 41751 (0.0012) [2024-08-05 18:36:18,009][15417] Signal inference workers to stop experience collection... (15450 times) [2024-08-05 18:36:18,017][15417] Signal inference workers to resume experience collection... (15450 times) [2024-08-05 18:36:18,053][15444] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-08-05 18:36:18,053][15444] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-08-05 18:36:18,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 342097920. Throughput: 0: 6060.2. Samples: 85530450. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 18:36:18,119][15372] Avg episode reward: [(0, '40.745')] [2024-08-05 18:36:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000041760_342097920.pth... 
[2024-08-05 18:36:18,254][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000041051_336289792.pth [2024-08-05 18:36:18,471][15444] Updated weights for policy 0, policy_version 41761 (0.0028) [2024-08-05 18:36:21,726][15444] Updated weights for policy 0, policy_version 41771 (0.0022) [2024-08-05 18:36:23,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 342220800. Throughput: 0: 6068.4. Samples: 85548320. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:36:23,119][15372] Avg episode reward: [(0, '41.090')] [2024-08-05 18:36:25,105][15444] Updated weights for policy 0, policy_version 41781 (0.0016) [2024-08-05 18:36:28,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 342335488. Throughput: 0: 6085.1. Samples: 85585040. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:36:28,126][15372] Avg episode reward: [(0, '41.168')] [2024-08-05 18:36:28,475][15444] Updated weights for policy 0, policy_version 41791 (0.0018) [2024-08-05 18:36:32,011][15444] Updated weights for policy 0, policy_version 41801 (0.0021) [2024-08-05 18:36:33,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 342458368. Throughput: 0: 6073.9. Samples: 85621110. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:36:33,119][15372] Avg episode reward: [(0, '41.945')] [2024-08-05 18:36:35,372][15444] Updated weights for policy 0, policy_version 41811 (0.0010) [2024-08-05 18:36:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 342581248. Throughput: 0: 6048.2. Samples: 85638820. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:36:38,124][15372] Avg episode reward: [(0, '42.263')] [2024-08-05 18:36:38,884][15444] Updated weights for policy 0, policy_version 41821 (0.0021) [2024-08-05 18:36:42,182][15444] Updated weights for policy 0, policy_version 41831 (0.0012) [2024-08-05 18:36:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 342704128. Throughput: 0: 6046.5. Samples: 85674960. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 18:36:43,119][15372] Avg episode reward: [(0, '42.540')] [2024-08-05 18:36:45,519][15444] Updated weights for policy 0, policy_version 41841 (0.0031) [2024-08-05 18:36:48,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 342818816. Throughput: 0: 6050.4. Samples: 85711070. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:36:48,126][15372] Avg episode reward: [(0, '41.807')] [2024-08-05 18:36:49,062][15444] Updated weights for policy 0, policy_version 41851 (0.0021) [2024-08-05 18:36:52,292][15444] Updated weights for policy 0, policy_version 41861 (0.0019) [2024-08-05 18:36:53,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 342933504. Throughput: 0: 6036.0. Samples: 85729220. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:36:53,126][15372] Avg episode reward: [(0, '40.874')] [2024-08-05 18:36:55,626][15444] Updated weights for policy 0, policy_version 41871 (0.0017) [2024-08-05 18:36:58,120][15372] Fps is (10 sec: 24572.7, 60 sec: 24302.4, 300 sec: 24214.9). Total num frames: 343064576. Throughput: 0: 6037.8. Samples: 85765610. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:36:58,120][15372] Avg episode reward: [(0, '41.310')] [2024-08-05 18:36:59,351][15444] Updated weights for policy 0, policy_version 41881 (0.0022) [2024-08-05 18:37:02,761][15444] Updated weights for policy 0, policy_version 41891 (0.0020) [2024-08-05 18:37:03,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 343179264. Throughput: 0: 6030.6. Samples: 85801830. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:37:03,119][15372] Avg episode reward: [(0, '41.294')] [2024-08-05 18:37:05,388][15417] Signal inference workers to stop experience collection... (15500 times) [2024-08-05 18:37:05,389][15417] Signal inference workers to resume experience collection... (15500 times) [2024-08-05 18:37:05,437][15444] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-08-05 18:37:05,437][15444] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-08-05 18:37:05,988][15444] Updated weights for policy 0, policy_version 41901 (0.0014) [2024-08-05 18:37:08,118][15372] Fps is (10 sec: 23760.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 343302144. Throughput: 0: 6031.1. Samples: 85819720. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:37:08,119][15372] Avg episode reward: [(0, '41.979')] [2024-08-05 18:37:09,241][15444] Updated weights for policy 0, policy_version 41911 (0.0010) [2024-08-05 18:37:12,504][15444] Updated weights for policy 0, policy_version 41921 (0.0019) [2024-08-05 18:37:13,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24029.6, 300 sec: 24187.2). Total num frames: 343416832. Throughput: 0: 6041.7. Samples: 85856920. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:37:13,119][15372] Avg episode reward: [(0, '41.460')] [2024-08-05 18:37:16,050][15444] Updated weights for policy 0, policy_version 41931 (0.0020) [2024-08-05 18:37:18,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 343547904. Throughput: 0: 6027.5. Samples: 85892350. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:37:18,119][15372] Avg episode reward: [(0, '41.177')] [2024-08-05 18:37:19,561][15444] Updated weights for policy 0, policy_version 41941 (0.0016) [2024-08-05 18:37:22,911][15444] Updated weights for policy 0, policy_version 41951 (0.0012) [2024-08-05 18:37:23,118][15372] Fps is (10 sec: 24577.7, 60 sec: 24030.0, 300 sec: 24159.5). Total num frames: 343662592. Throughput: 0: 6033.6. Samples: 85910330. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:37:23,119][15372] Avg episode reward: [(0, '41.433')] [2024-08-05 18:37:26,066][15444] Updated weights for policy 0, policy_version 41961 (0.0012) [2024-08-05 18:37:28,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 343785472. Throughput: 0: 6034.9. Samples: 85946530. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:37:28,119][15372] Avg episode reward: [(0, '42.137')] [2024-08-05 18:37:29,881][15444] Updated weights for policy 0, policy_version 41971 (0.0012) [2024-08-05 18:37:33,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 343900160. Throughput: 0: 6029.3. Samples: 85982390. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:37:33,126][15372] Avg episode reward: [(0, '42.336')] [2024-08-05 18:37:33,135][15444] Updated weights for policy 0, policy_version 41981 (0.0021) [2024-08-05 18:37:36,410][15444] Updated weights for policy 0, policy_version 41991 (0.0012) [2024-08-05 18:37:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24159.5). 
Total num frames: 344023040. Throughput: 0: 6047.8. Samples: 86001370. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:37:38,126][15372] Avg episode reward: [(0, '42.172')] [2024-08-05 18:37:39,911][15444] Updated weights for policy 0, policy_version 42001 (0.0018) [2024-08-05 18:37:42,993][15444] Updated weights for policy 0, policy_version 42011 (0.0029) [2024-08-05 18:37:43,118][15372] Fps is (10 sec: 25395.5, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 344154112. Throughput: 0: 6048.6. Samples: 86037790. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:37:43,119][15372] Avg episode reward: [(0, '42.079')] [2024-08-05 18:37:46,474][15444] Updated weights for policy 0, policy_version 42021 (0.0013) [2024-08-05 18:37:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 344268800. Throughput: 0: 6047.6. Samples: 86073970. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:37:48,126][15372] Avg episode reward: [(0, '42.818')] [2024-08-05 18:37:50,187][15444] Updated weights for policy 0, policy_version 42031 (0.0015) [2024-08-05 18:37:50,531][15417] Signal inference workers to stop experience collection... (15550 times) [2024-08-05 18:37:50,532][15417] Signal inference workers to resume experience collection... (15550 times) [2024-08-05 18:37:50,607][15444] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-08-05 18:37:50,607][15444] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-08-05 18:37:53,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 344391680. Throughput: 0: 6050.4. Samples: 86091990. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:37:53,119][15372] Avg episode reward: [(0, '43.055')] [2024-08-05 18:37:53,171][15444] Updated weights for policy 0, policy_version 42041 (0.0012) [2024-08-05 18:37:56,841][15444] Updated weights for policy 0, policy_version 42051 (0.0011) [2024-08-05 18:37:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24167.0, 300 sec: 24187.2). Total num frames: 344514560. Throughput: 0: 6019.2. Samples: 86127780. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:37:58,119][15372] Avg episode reward: [(0, '42.712')] [2024-08-05 18:38:00,245][15444] Updated weights for policy 0, policy_version 42061 (0.0016) [2024-08-05 18:38:03,127][15372] Fps is (10 sec: 23737.3, 60 sec: 24163.2, 300 sec: 24158.8). Total num frames: 344629248. Throughput: 0: 6030.9. Samples: 86163790. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:38:03,135][15372] Avg episode reward: [(0, '43.619')] [2024-08-05 18:38:03,798][15444] Updated weights for policy 0, policy_version 42071 (0.0025) [2024-08-05 18:38:07,330][15444] Updated weights for policy 0, policy_version 42081 (0.0013) [2024-08-05 18:38:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 344752128. Throughput: 0: 6020.9. Samples: 86181270. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:38:08,119][15372] Avg episode reward: [(0, '43.163')] [2024-08-05 18:38:10,358][15444] Updated weights for policy 0, policy_version 42091 (0.0031) [2024-08-05 18:38:13,118][15372] Fps is (10 sec: 23776.6, 60 sec: 24166.7, 300 sec: 24159.5). Total num frames: 344866816. Throughput: 0: 6019.6. Samples: 86217410. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:38:13,119][15372] Avg episode reward: [(0, '41.922')] [2024-08-05 18:38:14,064][15444] Updated weights for policy 0, policy_version 42101 (0.0021) [2024-08-05 18:38:17,577][15444] Updated weights for policy 0, policy_version 42111 (0.0018) [2024-08-05 18:38:18,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 344981504. Throughput: 0: 6006.0. Samples: 86252660. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:38:18,119][15372] Avg episode reward: [(0, '42.396')] [2024-08-05 18:38:18,209][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000042113_344989696.pth... [2024-08-05 18:38:18,346][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000041405_339189760.pth [2024-08-05 18:38:20,784][15444] Updated weights for policy 0, policy_version 42121 (0.0025) [2024-08-05 18:38:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 345112576. Throughput: 0: 5990.4. Samples: 86270940. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:38:23,119][15372] Avg episode reward: [(0, '43.270')] [2024-08-05 18:38:24,462][15444] Updated weights for policy 0, policy_version 42131 (0.0011) [2024-08-05 18:38:27,020][15417] Signal inference workers to stop experience collection... (15600 times) [2024-08-05 18:38:27,020][15417] Signal inference workers to resume experience collection... 
(15600 times) [2024-08-05 18:38:27,099][15444] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-08-05 18:38:27,104][15444] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-08-05 18:38:27,469][15444] Updated weights for policy 0, policy_version 42141 (0.0035) [2024-08-05 18:38:28,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 345227264. Throughput: 0: 5992.9. Samples: 86307470. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:38:28,119][15372] Avg episode reward: [(0, '43.122')] [2024-08-05 18:38:31,009][15444] Updated weights for policy 0, policy_version 42151 (0.0012) [2024-08-05 18:38:33,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 345350144. Throughput: 0: 6003.8. Samples: 86344140. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 18:38:33,119][15372] Avg episode reward: [(0, '42.831')] [2024-08-05 18:38:34,662][15444] Updated weights for policy 0, policy_version 42161 (0.0012) [2024-08-05 18:38:37,704][15444] Updated weights for policy 0, policy_version 42171 (0.0013) [2024-08-05 18:38:38,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 345473024. Throughput: 0: 5989.8. Samples: 86361530. Policy #0 lag: (min: 2.0, avg: 3.7, max: 8.0) [2024-08-05 18:38:38,119][15372] Avg episode reward: [(0, '42.781')] [2024-08-05 18:38:41,361][15444] Updated weights for policy 0, policy_version 42181 (0.0019) [2024-08-05 18:38:43,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23756.8, 300 sec: 24131.7). Total num frames: 345579520. Throughput: 0: 5989.1. Samples: 86397290. 
Policy #0 lag: (min: 2.0, avg: 3.7, max: 8.0) [2024-08-05 18:38:43,119][15372] Avg episode reward: [(0, '42.898')] [2024-08-05 18:38:44,358][15444] Updated weights for policy 0, policy_version 42191 (0.0023) [2024-08-05 18:38:48,097][15444] Updated weights for policy 0, policy_version 42201 (0.0028) [2024-08-05 18:38:48,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 345710592. Throughput: 0: 6003.5. Samples: 86433900. Policy #0 lag: (min: 2.0, avg: 3.7, max: 8.0) [2024-08-05 18:38:48,119][15372] Avg episode reward: [(0, '41.869')] [2024-08-05 18:38:51,628][15444] Updated weights for policy 0, policy_version 42211 (0.0012) [2024-08-05 18:38:53,119][15372] Fps is (10 sec: 25393.4, 60 sec: 24029.6, 300 sec: 24159.4). Total num frames: 345833472. Throughput: 0: 6011.7. Samples: 86451800. Policy #0 lag: (min: 2.0, avg: 3.7, max: 8.0) [2024-08-05 18:38:53,127][15372] Avg episode reward: [(0, '41.849')] [2024-08-05 18:38:54,658][15444] Updated weights for policy 0, policy_version 42221 (0.0014) [2024-08-05 18:38:56,925][15417] Signal inference workers to stop experience collection... (15650 times) [2024-08-05 18:38:56,928][15417] Signal inference workers to resume experience collection... (15650 times) [2024-08-05 18:38:56,980][15444] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-08-05 18:38:56,986][15444] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-08-05 18:38:58,119][15372] Fps is (10 sec: 23757.4, 60 sec: 23893.3, 300 sec: 24159.4). Total num frames: 345948160. Throughput: 0: 6018.9. Samples: 86488260. 
Policy #0 lag: (min: 2.0, avg: 3.7, max: 8.0) [2024-08-05 18:38:58,126][15372] Avg episode reward: [(0, '42.136')] [2024-08-05 18:38:58,255][15444] Updated weights for policy 0, policy_version 42231 (0.0021) [2024-08-05 18:39:01,463][15444] Updated weights for policy 0, policy_version 42241 (0.0022) [2024-08-05 18:39:03,118][15372] Fps is (10 sec: 23758.5, 60 sec: 24033.2, 300 sec: 24131.7). Total num frames: 346071040. Throughput: 0: 6027.6. Samples: 86523900. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:39:03,119][15372] Avg episode reward: [(0, '42.197')] [2024-08-05 18:39:04,917][15444] Updated weights for policy 0, policy_version 42251 (0.0019) [2024-08-05 18:39:08,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 346193920. Throughput: 0: 6036.4. Samples: 86542580. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:39:08,126][15372] Avg episode reward: [(0, '43.280')] [2024-08-05 18:39:08,509][15444] Updated weights for policy 0, policy_version 42261 (0.0028) [2024-08-05 18:39:11,713][15444] Updated weights for policy 0, policy_version 42271 (0.0016) [2024-08-05 18:39:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 346308608. Throughput: 0: 6026.0. Samples: 86578640. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:39:13,119][15372] Avg episode reward: [(0, '43.261')] [2024-08-05 18:39:15,166][15444] Updated weights for policy 0, policy_version 42281 (0.0014) [2024-08-05 18:39:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 346431488. Throughput: 0: 6015.5. Samples: 86614840. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:39:18,119][15372] Avg episode reward: [(0, '43.207')] [2024-08-05 18:39:18,464][15444] Updated weights for policy 0, policy_version 42291 (0.0023) [2024-08-05 18:39:22,065][15444] Updated weights for policy 0, policy_version 42301 (0.0018) [2024-08-05 18:39:23,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 346554368. Throughput: 0: 6023.8. Samples: 86632600. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:39:23,119][15372] Avg episode reward: [(0, '42.309')] [2024-08-05 18:39:25,745][15444] Updated weights for policy 0, policy_version 42311 (0.0022) [2024-08-05 18:39:28,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 346669056. Throughput: 0: 6018.9. Samples: 86668140. Policy #0 lag: (min: 2.0, avg: 4.3, max: 9.0) [2024-08-05 18:39:28,126][15372] Avg episode reward: [(0, '42.192')] [2024-08-05 18:39:28,892][15444] Updated weights for policy 0, policy_version 42321 (0.0010) [2024-08-05 18:39:32,392][15444] Updated weights for policy 0, policy_version 42331 (0.0017) [2024-08-05 18:39:33,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23893.4, 300 sec: 24076.2). Total num frames: 346783744. Throughput: 0: 6011.2. Samples: 86704400. Policy #0 lag: (min: 2.0, avg: 4.3, max: 9.0) [2024-08-05 18:39:33,119][15372] Avg episode reward: [(0, '43.060')] [2024-08-05 18:39:35,006][15417] Signal inference workers to stop experience collection... (15700 times) [2024-08-05 18:39:35,007][15417] Signal inference workers to resume experience collection... 
(15700 times) [2024-08-05 18:39:35,059][15444] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-08-05 18:39:35,059][15444] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-08-05 18:39:35,566][15444] Updated weights for policy 0, policy_version 42341 (0.0021) [2024-08-05 18:39:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 346914816. Throughput: 0: 6010.8. Samples: 86722280. Policy #0 lag: (min: 2.0, avg: 4.3, max: 9.0) [2024-08-05 18:39:38,119][15372] Avg episode reward: [(0, '42.434')] [2024-08-05 18:39:39,154][15444] Updated weights for policy 0, policy_version 42351 (0.0015) [2024-08-05 18:39:42,361][15444] Updated weights for policy 0, policy_version 42361 (0.0012) [2024-08-05 18:39:43,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 347029504. Throughput: 0: 6011.4. Samples: 86758770. Policy #0 lag: (min: 2.0, avg: 4.3, max: 9.0) [2024-08-05 18:39:43,119][15372] Avg episode reward: [(0, '42.502')] [2024-08-05 18:39:45,765][15444] Updated weights for policy 0, policy_version 42371 (0.0020) [2024-08-05 18:39:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.6, 300 sec: 24131.7). Total num frames: 347160576. Throughput: 0: 6029.3. Samples: 86795220. Policy #0 lag: (min: 2.0, avg: 4.3, max: 9.0) [2024-08-05 18:39:48,126][15372] Avg episode reward: [(0, '43.170')] [2024-08-05 18:39:49,278][15444] Updated weights for policy 0, policy_version 42381 (0.0012) [2024-08-05 18:39:52,680][15444] Updated weights for policy 0, policy_version 42391 (0.0012) [2024-08-05 18:39:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24030.1, 300 sec: 24103.9). Total num frames: 347275264. Throughput: 0: 6006.7. Samples: 86812880. 
Policy #0 lag: (min: 2.0, avg: 4.3, max: 9.0) [2024-08-05 18:39:53,119][15372] Avg episode reward: [(0, '41.955')] [2024-08-05 18:39:56,064][15444] Updated weights for policy 0, policy_version 42401 (0.0024) [2024-08-05 18:39:58,119][15372] Fps is (10 sec: 22937.5, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 347389952. Throughput: 0: 5998.4. Samples: 86848570. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:39:58,119][15372] Avg episode reward: [(0, '42.180')] [2024-08-05 18:39:59,416][15444] Updated weights for policy 0, policy_version 42411 (0.0018) [2024-08-05 18:40:03,089][15444] Updated weights for policy 0, policy_version 42421 (0.0031) [2024-08-05 18:40:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24104.0). Total num frames: 347512832. Throughput: 0: 5984.9. Samples: 86884160. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:40:03,119][15372] Avg episode reward: [(0, '42.368')] [2024-08-05 18:40:06,280][15444] Updated weights for policy 0, policy_version 42431 (0.0015) [2024-08-05 18:40:08,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 347635712. Throughput: 0: 5998.7. Samples: 86902540. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:40:08,126][15372] Avg episode reward: [(0, '42.439')] [2024-08-05 18:40:09,664][15444] Updated weights for policy 0, policy_version 42441 (0.0011) [2024-08-05 18:40:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 347750400. Throughput: 0: 6031.3. Samples: 86939550. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:40:13,127][15372] Avg episode reward: [(0, '42.875')] [2024-08-05 18:40:13,223][15444] Updated weights for policy 0, policy_version 42451 (0.0018) [2024-08-05 18:40:16,207][15444] Updated weights for policy 0, policy_version 42461 (0.0023) [2024-08-05 18:40:18,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24103.9). 
Total num frames: 347873280. Throughput: 0: 6018.0. Samples: 86975210. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:40:18,126][15372] Avg episode reward: [(0, '42.781')] [2024-08-05 18:40:18,143][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000042466_347881472.pth... [2024-08-05 18:40:18,261][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000041760_342097920.pth [2024-08-05 18:40:19,953][15444] Updated weights for policy 0, policy_version 42471 (0.0019) [2024-08-05 18:40:22,074][15417] Signal inference workers to stop experience collection... (15750 times) [2024-08-05 18:40:22,077][15417] Signal inference workers to resume experience collection... (15750 times) [2024-08-05 18:40:22,152][15444] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-08-05 18:40:22,161][15444] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-08-05 18:40:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 347996160. Throughput: 0: 6014.9. Samples: 86992950. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 18:40:23,126][15372] Avg episode reward: [(0, '41.808')] [2024-08-05 18:40:23,622][15444] Updated weights for policy 0, policy_version 42481 (0.0012) [2024-08-05 18:40:26,507][15444] Updated weights for policy 0, policy_version 42491 (0.0011) [2024-08-05 18:40:28,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 348110848. Throughput: 0: 6005.8. Samples: 87029030. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 18:40:28,119][15372] Avg episode reward: [(0, '42.406')] [2024-08-05 18:40:30,161][15444] Updated weights for policy 0, policy_version 42501 (0.0012) [2024-08-05 18:40:33,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24302.8, 300 sec: 24131.7). 
Total num frames: 348241920. Throughput: 0: 6004.4. Samples: 87065420. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 18:40:33,127][15372] Avg episode reward: [(0, '42.164')] [2024-08-05 18:40:33,366][15444] Updated weights for policy 0, policy_version 42511 (0.0015) [2024-08-05 18:40:36,949][15444] Updated weights for policy 0, policy_version 42521 (0.0012) [2024-08-05 18:40:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 348356608. Throughput: 0: 6020.9. Samples: 87083820. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 18:40:38,119][15372] Avg episode reward: [(0, '42.559')] [2024-08-05 18:40:40,360][15444] Updated weights for policy 0, policy_version 42531 (0.0021) [2024-08-05 18:40:43,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.3, 300 sec: 24104.0). Total num frames: 348479488. Throughput: 0: 6019.8. Samples: 87119460. Policy #0 lag: (min: 0.0, avg: 4.7, max: 8.0) [2024-08-05 18:40:43,119][15372] Avg episode reward: [(0, '43.140')] [2024-08-05 18:40:43,800][15444] Updated weights for policy 0, policy_version 42541 (0.0017) [2024-08-05 18:40:47,221][15444] Updated weights for policy 0, policy_version 42551 (0.0013) [2024-08-05 18:40:48,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 348594176. Throughput: 0: 6023.5. Samples: 87155220. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:40:48,119][15372] Avg episode reward: [(0, '43.810')] [2024-08-05 18:40:50,522][15444] Updated weights for policy 0, policy_version 42561 (0.0021) [2024-08-05 18:40:53,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 348717056. Throughput: 0: 6014.2. Samples: 87173180. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:40:53,126][15372] Avg episode reward: [(0, '43.022')] [2024-08-05 18:40:54,048][15444] Updated weights for policy 0, policy_version 42571 (0.0010) [2024-08-05 18:40:57,489][15444] Updated weights for policy 0, policy_version 42581 (0.0021) [2024-08-05 18:40:58,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 348831744. Throughput: 0: 5985.6. Samples: 87208900. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:40:58,119][15372] Avg episode reward: [(0, '42.473')] [2024-08-05 18:41:00,814][15444] Updated weights for policy 0, policy_version 42591 (0.0014) [2024-08-05 18:41:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 348954624. Throughput: 0: 6004.7. Samples: 87245420. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:41:03,126][15372] Avg episode reward: [(0, '42.990')] [2024-08-05 18:41:04,326][15444] Updated weights for policy 0, policy_version 42601 (0.0035) [2024-08-05 18:41:07,993][15444] Updated weights for policy 0, policy_version 42611 (0.0032) [2024-08-05 18:41:08,118][15372] Fps is (10 sec: 23756.6, 60 sec: 23893.3, 300 sec: 24048.4). Total num frames: 349069312. Throughput: 0: 6012.2. Samples: 87263500. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:41:08,119][15372] Avg episode reward: [(0, '43.242')] [2024-08-05 18:41:10,871][15444] Updated weights for policy 0, policy_version 42621 (0.0022) [2024-08-05 18:41:13,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24076.2). Total num frames: 349200384. Throughput: 0: 6006.0. Samples: 87299300. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:41:13,119][15372] Avg episode reward: [(0, '42.507')] [2024-08-05 18:41:14,518][15417] Signal inference workers to stop experience collection... (15800 times) [2024-08-05 18:41:14,526][15417] Signal inference workers to resume experience collection... 
(15800 times) [2024-08-05 18:41:14,559][15444] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-08-05 18:41:14,564][15444] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-08-05 18:41:14,641][15444] Updated weights for policy 0, policy_version 42631 (0.0011) [2024-08-05 18:41:18,039][15444] Updated weights for policy 0, policy_version 42641 (0.0022) [2024-08-05 18:41:18,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24048.4). Total num frames: 349315072. Throughput: 0: 6001.4. Samples: 87335480. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:41:18,119][15372] Avg episode reward: [(0, '40.889')] [2024-08-05 18:41:21,318][15444] Updated weights for policy 0, policy_version 42651 (0.0032) [2024-08-05 18:41:23,119][15372] Fps is (10 sec: 22937.0, 60 sec: 23893.2, 300 sec: 24048.4). Total num frames: 349429760. Throughput: 0: 5996.0. Samples: 87353640. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:41:23,126][15372] Avg episode reward: [(0, '41.136')] [2024-08-05 18:41:24,639][15444] Updated weights for policy 0, policy_version 42661 (0.0038) [2024-08-05 18:41:28,072][15444] Updated weights for policy 0, policy_version 42671 (0.0017) [2024-08-05 18:41:28,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24076.1). Total num frames: 349560832. Throughput: 0: 6006.4. Samples: 87389750. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:41:28,119][15372] Avg episode reward: [(0, '41.388')] [2024-08-05 18:41:31,565][15444] Updated weights for policy 0, policy_version 42681 (0.0034) [2024-08-05 18:41:33,119][15372] Fps is (10 sec: 24576.2, 60 sec: 23893.3, 300 sec: 24048.4). Total num frames: 349675520. Throughput: 0: 6009.3. Samples: 87425640. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:41:33,126][15372] Avg episode reward: [(0, '41.953')] [2024-08-05 18:41:35,021][15444] Updated weights for policy 0, policy_version 42691 (0.0021) [2024-08-05 18:41:38,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24048.4). Total num frames: 349798400. Throughput: 0: 6026.2. Samples: 87444360. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:41:38,126][15372] Avg episode reward: [(0, '42.810')] [2024-08-05 18:41:38,427][15444] Updated weights for policy 0, policy_version 42701 (0.0035) [2024-08-05 18:41:41,558][15444] Updated weights for policy 0, policy_version 42711 (0.0011) [2024-08-05 18:41:43,119][15372] Fps is (10 sec: 24576.5, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 349921280. Throughput: 0: 6032.4. Samples: 87480360. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:41:43,126][15372] Avg episode reward: [(0, '42.906')] [2024-08-05 18:41:45,009][15444] Updated weights for policy 0, policy_version 42721 (0.0026) [2024-08-05 18:41:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 350044160. Throughput: 0: 6040.0. Samples: 87517220. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:41:48,126][15372] Avg episode reward: [(0, '42.741')] [2024-08-05 18:41:48,235][15444] Updated weights for policy 0, policy_version 42731 (0.0013) [2024-08-05 18:41:51,721][15444] Updated weights for policy 0, policy_version 42741 (0.0019) [2024-08-05 18:41:53,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24048.5). Total num frames: 350158848. Throughput: 0: 6048.5. Samples: 87535680. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:41:53,119][15372] Avg episode reward: [(0, '41.883')] [2024-08-05 18:41:55,103][15444] Updated weights for policy 0, policy_version 42751 (0.0024) [2024-08-05 18:41:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24104.0). 
Total num frames: 350289920. Throughput: 0: 6062.7. Samples: 87572120. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:41:58,126][15372] Avg episode reward: [(0, '42.639')] [2024-08-05 18:41:58,423][15444] Updated weights for policy 0, policy_version 42761 (0.0027) [2024-08-05 18:42:01,893][15444] Updated weights for policy 0, policy_version 42771 (0.0026) [2024-08-05 18:42:03,119][15372] Fps is (10 sec: 25394.9, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 350412800. Throughput: 0: 6056.4. Samples: 87608020. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:42:03,119][15372] Avg episode reward: [(0, '42.621')] [2024-08-05 18:42:05,054][15444] Updated weights for policy 0, policy_version 42781 (0.0013) [2024-08-05 18:42:08,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24104.0). Total num frames: 350527488. Throughput: 0: 6068.0. Samples: 87626700. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:42:08,126][15372] Avg episode reward: [(0, '42.836')] [2024-08-05 18:42:08,658][15444] Updated weights for policy 0, policy_version 42791 (0.0011) [2024-08-05 18:42:09,033][15417] Signal inference workers to stop experience collection... (15850 times) [2024-08-05 18:42:09,036][15417] Signal inference workers to resume experience collection... (15850 times) [2024-08-05 18:42:09,086][15444] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-08-05 18:42:09,087][15444] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-08-05 18:42:11,862][15444] Updated weights for policy 0, policy_version 42801 (0.0012) [2024-08-05 18:42:13,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.5, 300 sec: 24076.2). Total num frames: 350650368. Throughput: 0: 6067.4. Samples: 87662780. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:42:13,126][15372] Avg episode reward: [(0, '43.136')] [2024-08-05 18:42:15,345][15444] Updated weights for policy 0, policy_version 42811 (0.0016) [2024-08-05 18:42:18,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24302.8, 300 sec: 24103.9). Total num frames: 350773248. Throughput: 0: 6043.5. Samples: 87697600. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:42:18,119][15372] Avg episode reward: [(0, '42.939')] [2024-08-05 18:42:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000042819_350773248.pth... [2024-08-05 18:42:18,255][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000042113_344989696.pth [2024-08-05 18:42:19,182][15444] Updated weights for policy 0, policy_version 42821 (0.0029) [2024-08-05 18:42:22,195][15444] Updated weights for policy 0, policy_version 42831 (0.0016) [2024-08-05 18:42:23,118][15372] Fps is (10 sec: 22937.4, 60 sec: 24166.5, 300 sec: 24048.4). Total num frames: 350879744. Throughput: 0: 6034.2. Samples: 87715900. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:42:23,126][15372] Avg episode reward: [(0, '42.725')] [2024-08-05 18:42:25,754][15444] Updated weights for policy 0, policy_version 42841 (0.0013) [2024-08-05 18:42:28,118][15372] Fps is (10 sec: 23757.6, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 351010816. Throughput: 0: 6036.2. Samples: 87751990. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:42:28,119][15372] Avg episode reward: [(0, '43.053')] [2024-08-05 18:42:29,210][15444] Updated weights for policy 0, policy_version 42851 (0.0017) [2024-08-05 18:42:32,544][15444] Updated weights for policy 0, policy_version 42861 (0.0016) [2024-08-05 18:42:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24076.1). Total num frames: 351125504. 
Throughput: 0: 6014.7. Samples: 87787880. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:42:33,119][15372] Avg episode reward: [(0, '42.892')] [2024-08-05 18:42:36,000][15444] Updated weights for policy 0, policy_version 42871 (0.0015) [2024-08-05 18:42:38,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24048.4). Total num frames: 351248384. Throughput: 0: 6015.8. Samples: 87806390. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:42:38,119][15372] Avg episode reward: [(0, '43.007')] [2024-08-05 18:42:39,230][15444] Updated weights for policy 0, policy_version 42881 (0.0014) [2024-08-05 18:42:42,907][15444] Updated weights for policy 0, policy_version 42891 (0.0021) [2024-08-05 18:42:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24048.4). Total num frames: 351363072. Throughput: 0: 6010.9. Samples: 87842610. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:42:43,119][15372] Avg episode reward: [(0, '42.831')] [2024-08-05 18:42:46,378][15444] Updated weights for policy 0, policy_version 42901 (0.0014) [2024-08-05 18:42:48,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.8, 300 sec: 24048.4). Total num frames: 351485952. Throughput: 0: 5990.6. Samples: 87877600. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:42:48,126][15372] Avg episode reward: [(0, '42.083')] [2024-08-05 18:42:49,506][15444] Updated weights for policy 0, policy_version 42911 (0.0018) [2024-08-05 18:42:49,616][15417] Signal inference workers to stop experience collection... (15900 times) [2024-08-05 18:42:49,619][15417] Signal inference workers to resume experience collection... 
(15900 times) [2024-08-05 18:42:49,663][15444] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-08-05 18:42:49,664][15444] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-08-05 18:42:53,102][15444] Updated weights for policy 0, policy_version 42921 (0.0027) [2024-08-05 18:42:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24048.4). Total num frames: 351608832. Throughput: 0: 5995.8. Samples: 87896510. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:42:53,119][15372] Avg episode reward: [(0, '42.614')] [2024-08-05 18:42:56,180][15444] Updated weights for policy 0, policy_version 42931 (0.0011) [2024-08-05 18:42:58,120][15372] Fps is (10 sec: 24573.5, 60 sec: 24029.4, 300 sec: 24076.7). Total num frames: 351731712. Throughput: 0: 5995.6. Samples: 87932590. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:42:58,127][15372] Avg episode reward: [(0, '42.957')] [2024-08-05 18:42:59,637][15444] Updated weights for policy 0, policy_version 42941 (0.0035) [2024-08-05 18:43:03,119][15372] Fps is (10 sec: 23756.1, 60 sec: 23893.3, 300 sec: 24048.4). Total num frames: 351846400. Throughput: 0: 6023.4. Samples: 87968650. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:43:03,127][15372] Avg episode reward: [(0, '42.138')] [2024-08-05 18:43:03,261][15444] Updated weights for policy 0, policy_version 42951 (0.0014) [2024-08-05 18:43:06,335][15444] Updated weights for policy 0, policy_version 42961 (0.0019) [2024-08-05 18:43:08,118][15372] Fps is (10 sec: 23759.8, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 351969280. Throughput: 0: 6038.4. Samples: 87987630. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:43:08,126][15372] Avg episode reward: [(0, '41.825')] [2024-08-05 18:43:09,874][15444] Updated weights for policy 0, policy_version 42971 (0.0011) [2024-08-05 18:43:12,999][15444] Updated weights for policy 0, policy_version 42981 (0.0032) [2024-08-05 18:43:13,118][15372] Fps is (10 sec: 25395.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 352100352. Throughput: 0: 6046.4. Samples: 88024080. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:43:13,119][15372] Avg episode reward: [(0, '41.627')] [2024-08-05 18:43:16,642][15444] Updated weights for policy 0, policy_version 42991 (0.0024) [2024-08-05 18:43:18,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24030.0, 300 sec: 24076.1). Total num frames: 352215040. Throughput: 0: 6041.5. Samples: 88059750. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:43:18,119][15372] Avg episode reward: [(0, '41.614')] [2024-08-05 18:43:20,000][15444] Updated weights for policy 0, policy_version 43001 (0.0023) [2024-08-05 18:43:23,119][15372] Fps is (10 sec: 22937.5, 60 sec: 24166.4, 300 sec: 24076.1). Total num frames: 352329728. Throughput: 0: 6048.9. Samples: 88078590. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:43:23,126][15372] Avg episode reward: [(0, '44.275')] [2024-08-05 18:43:23,127][15417] Saving new best policy, reward=44.275! [2024-08-05 18:43:23,453][15444] Updated weights for policy 0, policy_version 43011 (0.0019) [2024-08-05 18:43:26,981][15444] Updated weights for policy 0, policy_version 43021 (0.0018) [2024-08-05 18:43:28,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 352460800. Throughput: 0: 6028.4. Samples: 88113890. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:43:28,119][15372] Avg episode reward: [(0, '44.249')] [2024-08-05 18:43:30,086][15444] Updated weights for policy 0, policy_version 43031 (0.0019) [2024-08-05 18:43:33,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24076.1). Total num frames: 352575488. Throughput: 0: 6067.8. Samples: 88150650. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:43:33,126][15372] Avg episode reward: [(0, '43.243')] [2024-08-05 18:43:33,751][15444] Updated weights for policy 0, policy_version 43041 (0.0017) [2024-08-05 18:43:37,008][15444] Updated weights for policy 0, policy_version 43051 (0.0020) [2024-08-05 18:43:38,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 352698368. Throughput: 0: 6060.4. Samples: 88169230. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:43:38,119][15372] Avg episode reward: [(0, '43.463')] [2024-08-05 18:43:40,285][15444] Updated weights for policy 0, policy_version 43061 (0.0012) [2024-08-05 18:43:43,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 352821248. Throughput: 0: 6070.8. Samples: 88205770. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:43:43,126][15372] Avg episode reward: [(0, '43.319')] [2024-08-05 18:43:43,713][15444] Updated weights for policy 0, policy_version 43071 (0.0015) [2024-08-05 18:43:43,853][15417] Signal inference workers to stop experience collection... (15950 times) [2024-08-05 18:43:43,854][15417] Signal inference workers to resume experience collection... 
(15950 times) [2024-08-05 18:43:43,893][15444] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-08-05 18:43:43,893][15444] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-08-05 18:43:46,938][15444] Updated weights for policy 0, policy_version 43081 (0.0026) [2024-08-05 18:43:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.0, 300 sec: 24104.0). Total num frames: 352944128. Throughput: 0: 6066.7. Samples: 88241650. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:43:48,126][15372] Avg episode reward: [(0, '41.685')] [2024-08-05 18:43:50,674][15444] Updated weights for policy 0, policy_version 43091 (0.0012) [2024-08-05 18:43:53,124][15372] Fps is (10 sec: 23743.4, 60 sec: 24164.1, 300 sec: 24103.5). Total num frames: 353058816. Throughput: 0: 6058.3. Samples: 88260290. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:43:53,132][15372] Avg episode reward: [(0, '42.014')] [2024-08-05 18:43:53,919][15444] Updated weights for policy 0, policy_version 43101 (0.0020) [2024-08-05 18:43:57,447][15444] Updated weights for policy 0, policy_version 43111 (0.0011) [2024-08-05 18:43:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.9, 300 sec: 24103.9). Total num frames: 353181696. Throughput: 0: 6031.6. Samples: 88295500. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 18:43:58,119][15372] Avg episode reward: [(0, '42.782')] [2024-08-05 18:44:00,629][15444] Updated weights for policy 0, policy_version 43121 (0.0018) [2024-08-05 18:44:03,118][15372] Fps is (10 sec: 23770.5, 60 sec: 24166.5, 300 sec: 24076.2). Total num frames: 353296384. Throughput: 0: 6034.9. Samples: 88331320. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 18:44:03,132][15372] Avg episode reward: [(0, '42.076')] [2024-08-05 18:44:04,195][15444] Updated weights for policy 0, policy_version 43131 (0.0024) [2024-08-05 18:44:07,893][15444] Updated weights for policy 0, policy_version 43141 (0.0016) [2024-08-05 18:44:08,119][15372] Fps is (10 sec: 22936.9, 60 sec: 24029.8, 300 sec: 24076.1). Total num frames: 353411072. Throughput: 0: 6018.4. Samples: 88349420. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 18:44:08,119][15372] Avg episode reward: [(0, '42.757')] [2024-08-05 18:44:10,930][15444] Updated weights for policy 0, policy_version 43151 (0.0020) [2024-08-05 18:44:13,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 353542144. Throughput: 0: 6028.2. Samples: 88385160. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 18:44:13,126][15372] Avg episode reward: [(0, '43.107')] [2024-08-05 18:44:14,631][15444] Updated weights for policy 0, policy_version 43161 (0.0028) [2024-08-05 18:44:17,832][15444] Updated weights for policy 0, policy_version 43171 (0.0017) [2024-08-05 18:44:18,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24029.7, 300 sec: 24076.1). Total num frames: 353656832. Throughput: 0: 5998.8. Samples: 88420600. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 18:44:18,119][15372] Avg episode reward: [(0, '42.722')] [2024-08-05 18:44:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000043171_353656832.pth... [2024-08-05 18:44:18,228][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000042466_347881472.pth [2024-08-05 18:44:21,354][15444] Updated weights for policy 0, policy_version 43181 (0.0013) [2024-08-05 18:44:23,118][15372] Fps is (10 sec: 22937.8, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 353771520. 
Throughput: 0: 5993.6. Samples: 88438940. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 18:44:23,126][15372] Avg episode reward: [(0, '44.005')] [2024-08-05 18:44:24,881][15444] Updated weights for policy 0, policy_version 43191 (0.0026) [2024-08-05 18:44:27,408][15417] Signal inference workers to stop experience collection... (16000 times) [2024-08-05 18:44:27,411][15417] Signal inference workers to resume experience collection... (16000 times) [2024-08-05 18:44:27,451][15444] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-08-05 18:44:27,451][15444] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-08-05 18:44:27,998][15444] Updated weights for policy 0, policy_version 43201 (0.0013) [2024-08-05 18:44:28,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 353902592. Throughput: 0: 5986.2. Samples: 88475150. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:44:28,119][15372] Avg episode reward: [(0, '42.941')] [2024-08-05 18:44:31,640][15444] Updated weights for policy 0, policy_version 43211 (0.0019) [2024-08-05 18:44:33,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 354017280. Throughput: 0: 5980.7. Samples: 88510780. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:44:33,126][15372] Avg episode reward: [(0, '42.260')] [2024-08-05 18:44:35,041][15444] Updated weights for policy 0, policy_version 43221 (0.0021) [2024-08-05 18:44:38,124][15372] Fps is (10 sec: 23743.9, 60 sec: 24027.7, 300 sec: 24103.5). Total num frames: 354140160. Throughput: 0: 5987.6. Samples: 88529730. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:44:38,132][15372] Avg episode reward: [(0, '42.160')] [2024-08-05 18:44:38,203][15444] Updated weights for policy 0, policy_version 43231 (0.0012) [2024-08-05 18:44:41,722][15444] Updated weights for policy 0, policy_version 43241 (0.0023) [2024-08-05 18:44:43,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 354263040. Throughput: 0: 6013.3. Samples: 88566100. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:44:43,119][15372] Avg episode reward: [(0, '42.667')] [2024-08-05 18:44:44,876][15444] Updated weights for policy 0, policy_version 43251 (0.0012) [2024-08-05 18:44:48,119][15372] Fps is (10 sec: 23769.6, 60 sec: 23893.3, 300 sec: 24076.1). Total num frames: 354377728. Throughput: 0: 6025.5. Samples: 88602470. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:44:48,126][15372] Avg episode reward: [(0, '43.148')] [2024-08-05 18:44:48,519][15444] Updated weights for policy 0, policy_version 43261 (0.0016) [2024-08-05 18:44:51,931][15444] Updated weights for policy 0, policy_version 43271 (0.0025) [2024-08-05 18:44:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24168.7, 300 sec: 24131.7). Total num frames: 354508800. Throughput: 0: 6028.0. Samples: 88620680. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 18:44:53,119][15372] Avg episode reward: [(0, '43.312')] [2024-08-05 18:44:55,092][15444] Updated weights for policy 0, policy_version 43281 (0.0015) [2024-08-05 18:44:58,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24029.7, 300 sec: 24103.9). Total num frames: 354623488. Throughput: 0: 6046.2. Samples: 88657240. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 18:44:58,127][15372] Avg episode reward: [(0, '43.271')] [2024-08-05 18:44:58,801][15444] Updated weights for policy 0, policy_version 43291 (0.0020) [2024-08-05 18:45:00,944][15417] Signal inference workers to stop experience collection... 
(16050 times) [2024-08-05 18:45:00,944][15417] Signal inference workers to resume experience collection... (16050 times) [2024-08-05 18:45:00,989][15444] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-08-05 18:45:00,994][15444] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-08-05 18:45:01,735][15444] Updated weights for policy 0, policy_version 43301 (0.0027) [2024-08-05 18:45:03,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 354746368. Throughput: 0: 6047.8. Samples: 88692750. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 18:45:03,119][15372] Avg episode reward: [(0, '42.315')] [2024-08-05 18:45:05,333][15444] Updated weights for policy 0, policy_version 43311 (0.0016) [2024-08-05 18:45:08,118][15372] Fps is (10 sec: 25396.0, 60 sec: 24439.6, 300 sec: 24159.5). Total num frames: 354877440. Throughput: 0: 6046.7. Samples: 88711040. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 18:45:08,119][15372] Avg episode reward: [(0, '42.074')] [2024-08-05 18:45:08,763][15444] Updated weights for policy 0, policy_version 43321 (0.0020) [2024-08-05 18:45:11,998][15444] Updated weights for policy 0, policy_version 43331 (0.0013) [2024-08-05 18:45:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 354983936. Throughput: 0: 6054.7. Samples: 88747610. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 18:45:13,119][15372] Avg episode reward: [(0, '43.133')] [2024-08-05 18:45:15,441][15444] Updated weights for policy 0, policy_version 43341 (0.0010) [2024-08-05 18:45:18,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24303.0, 300 sec: 24131.7). Total num frames: 355115008. Throughput: 0: 6068.0. Samples: 88783840. 
Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 18:45:18,127][15372] Avg episode reward: [(0, '43.737')] [2024-08-05 18:45:18,925][15444] Updated weights for policy 0, policy_version 43351 (0.0014) [2024-08-05 18:45:22,359][15444] Updated weights for policy 0, policy_version 43361 (0.0021) [2024-08-05 18:45:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 355229696. Throughput: 0: 6041.2. Samples: 88801550. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:45:23,119][15372] Avg episode reward: [(0, '42.509')] [2024-08-05 18:45:25,799][15444] Updated weights for policy 0, policy_version 43371 (0.0035) [2024-08-05 18:45:28,119][15372] Fps is (10 sec: 23757.5, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 355352576. Throughput: 0: 6034.7. Samples: 88837660. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:45:28,126][15372] Avg episode reward: [(0, '42.187')] [2024-08-05 18:45:29,027][15444] Updated weights for policy 0, policy_version 43381 (0.0010) [2024-08-05 18:45:32,477][15444] Updated weights for policy 0, policy_version 43391 (0.0020) [2024-08-05 18:45:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24131.7). Total num frames: 355475456. Throughput: 0: 6027.3. Samples: 88873700. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:45:33,119][15372] Avg episode reward: [(0, '42.425')] [2024-08-05 18:45:36,113][15444] Updated weights for policy 0, policy_version 43401 (0.0014) [2024-08-05 18:45:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24168.6, 300 sec: 24103.9). Total num frames: 355590144. Throughput: 0: 6026.7. Samples: 88891880. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:45:38,119][15372] Avg episode reward: [(0, '42.101')] [2024-08-05 18:45:39,072][15444] Updated weights for policy 0, policy_version 43411 (0.0011) [2024-08-05 18:45:42,638][15444] Updated weights for policy 0, policy_version 43421 (0.0019) [2024-08-05 18:45:43,119][15372] Fps is (10 sec: 23755.3, 60 sec: 24166.1, 300 sec: 24131.6). Total num frames: 355713024. Throughput: 0: 6027.8. Samples: 88928490. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 18:45:43,119][15372] Avg episode reward: [(0, '42.449')] [2024-08-05 18:45:46,090][15444] Updated weights for policy 0, policy_version 43431 (0.0013) [2024-08-05 18:45:48,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 355827712. Throughput: 0: 6040.9. Samples: 88964590. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:45:48,126][15372] Avg episode reward: [(0, '42.829')] [2024-08-05 18:45:49,530][15444] Updated weights for policy 0, policy_version 43441 (0.0017) [2024-08-05 18:45:52,858][15444] Updated weights for policy 0, policy_version 43451 (0.0022) [2024-08-05 18:45:53,118][15372] Fps is (10 sec: 24577.5, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 355958784. Throughput: 0: 6038.9. Samples: 88982790. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:45:53,119][15372] Avg episode reward: [(0, '41.867')] [2024-08-05 18:45:56,207][15444] Updated weights for policy 0, policy_version 43461 (0.0017) [2024-08-05 18:45:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 356073472. Throughput: 0: 6023.6. Samples: 89018670. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:45:58,126][15372] Avg episode reward: [(0, '42.433')] [2024-08-05 18:45:59,606][15444] Updated weights for policy 0, policy_version 43471 (0.0014) [2024-08-05 18:46:02,926][15444] Updated weights for policy 0, policy_version 43481 (0.0027) [2024-08-05 18:46:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 356196352. Throughput: 0: 6026.3. Samples: 89055020. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:46:03,119][15372] Avg episode reward: [(0, '42.613')] [2024-08-05 18:46:05,769][15417] Signal inference workers to stop experience collection... (16100 times) [2024-08-05 18:46:05,769][15417] Signal inference workers to resume experience collection... (16100 times) [2024-08-05 18:46:05,797][15444] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-08-05 18:46:05,849][15444] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-08-05 18:46:06,382][15444] Updated weights for policy 0, policy_version 43491 (0.0020) [2024-08-05 18:46:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 356319232. Throughput: 0: 6047.8. Samples: 89073700. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:46:08,126][15372] Avg episode reward: [(0, '41.706')] [2024-08-05 18:46:09,646][15444] Updated weights for policy 0, policy_version 43501 (0.0032) [2024-08-05 18:46:13,049][15444] Updated weights for policy 0, policy_version 43511 (0.0022) [2024-08-05 18:46:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 356442112. Throughput: 0: 6050.0. Samples: 89109910. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:46:13,119][15372] Avg episode reward: [(0, '42.232')] [2024-08-05 18:46:16,334][15444] Updated weights for policy 0, policy_version 43521 (0.0021) [2024-08-05 18:46:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24030.0, 300 sec: 24159.5). Total num frames: 356556800. Throughput: 0: 6048.9. Samples: 89145900. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 18:46:18,126][15372] Avg episode reward: [(0, '42.366')] [2024-08-05 18:46:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000043525_356556800.pth... [2024-08-05 18:46:18,253][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000042819_350773248.pth [2024-08-05 18:46:20,065][15444] Updated weights for policy 0, policy_version 43531 (0.0027) [2024-08-05 18:46:23,106][15444] Updated weights for policy 0, policy_version 43541 (0.0028) [2024-08-05 18:46:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 356687872. Throughput: 0: 6054.7. Samples: 89164340. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 18:46:23,119][15372] Avg episode reward: [(0, '41.035')] [2024-08-05 18:46:26,716][15444] Updated weights for policy 0, policy_version 43551 (0.0018) [2024-08-05 18:46:28,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 356802560. Throughput: 0: 6050.1. Samples: 89200740. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 18:46:28,119][15372] Avg episode reward: [(0, '41.319')] [2024-08-05 18:46:30,023][15444] Updated weights for policy 0, policy_version 43561 (0.0018) [2024-08-05 18:46:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 356925440. Throughput: 0: 6064.9. Samples: 89237510. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 18:46:33,126][15372] Avg episode reward: [(0, '41.919')] [2024-08-05 18:46:33,398][15444] Updated weights for policy 0, policy_version 43571 (0.0012) [2024-08-05 18:46:36,779][15444] Updated weights for policy 0, policy_version 43581 (0.0015) [2024-08-05 18:46:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 357048320. Throughput: 0: 6064.0. Samples: 89255670. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 18:46:38,119][15372] Avg episode reward: [(0, '43.093')] [2024-08-05 18:46:39,971][15444] Updated weights for policy 0, policy_version 43591 (0.0037) [2024-08-05 18:46:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.7, 300 sec: 24131.7). Total num frames: 357163008. Throughput: 0: 6073.3. Samples: 89291970. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:46:43,126][15372] Avg episode reward: [(0, '43.409')] [2024-08-05 18:46:43,475][15444] Updated weights for policy 0, policy_version 43601 (0.0012) [2024-08-05 18:46:47,042][15444] Updated weights for policy 0, policy_version 43611 (0.0011) [2024-08-05 18:46:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 357294080. Throughput: 0: 6071.6. Samples: 89328240. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:46:48,119][15372] Avg episode reward: [(0, '44.025')] [2024-08-05 18:46:50,049][15444] Updated weights for policy 0, policy_version 43621 (0.0022) [2024-08-05 18:46:53,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 357408768. Throughput: 0: 6069.3. Samples: 89346820. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:46:53,126][15372] Avg episode reward: [(0, '42.646')] [2024-08-05 18:46:53,764][15444] Updated weights for policy 0, policy_version 43631 (0.0022) [2024-08-05 18:46:56,084][15417] Signal inference workers to stop experience collection... 
(16150 times) [2024-08-05 18:46:56,085][15417] Signal inference workers to resume experience collection... (16150 times) [2024-08-05 18:46:56,109][15444] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-08-05 18:46:56,109][15444] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-08-05 18:46:57,137][15444] Updated weights for policy 0, policy_version 43641 (0.0029) [2024-08-05 18:46:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.0, 300 sec: 24131.7). Total num frames: 357531648. Throughput: 0: 6056.7. Samples: 89382460. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:46:58,119][15372] Avg episode reward: [(0, '42.727')] [2024-08-05 18:47:00,218][15444] Updated weights for policy 0, policy_version 43651 (0.0012) [2024-08-05 18:47:03,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 357646336. Throughput: 0: 6079.1. Samples: 89419460. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:47:03,126][15372] Avg episode reward: [(0, '42.990')] [2024-08-05 18:47:03,709][15444] Updated weights for policy 0, policy_version 43661 (0.0016) [2024-08-05 18:47:07,317][15444] Updated weights for policy 0, policy_version 43671 (0.0014) [2024-08-05 18:47:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 357777408. Throughput: 0: 6069.8. Samples: 89437480. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:47:08,119][15372] Avg episode reward: [(0, '42.432')] [2024-08-05 18:47:10,488][15444] Updated weights for policy 0, policy_version 43681 (0.0020) [2024-08-05 18:47:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 357892096. Throughput: 0: 6052.7. Samples: 89473110. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:47:13,126][15372] Avg episode reward: [(0, '42.198')] [2024-08-05 18:47:14,068][15444] Updated weights for policy 0, policy_version 43691 (0.0011) [2024-08-05 18:47:17,314][15444] Updated weights for policy 0, policy_version 43701 (0.0020) [2024-08-05 18:47:18,119][15372] Fps is (10 sec: 22937.3, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 358006784. Throughput: 0: 6027.5. Samples: 89508750. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:47:18,126][15372] Avg episode reward: [(0, '42.285')] [2024-08-05 18:47:20,680][15444] Updated weights for policy 0, policy_version 43711 (0.0011) [2024-08-05 18:47:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 358137856. Throughput: 0: 6036.2. Samples: 89527300. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:47:23,119][15372] Avg episode reward: [(0, '42.426')] [2024-08-05 18:47:24,457][15444] Updated weights for policy 0, policy_version 43721 (0.0023) [2024-08-05 18:47:27,627][15444] Updated weights for policy 0, policy_version 43731 (0.0012) [2024-08-05 18:47:28,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 358252544. Throughput: 0: 6029.6. Samples: 89563300. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:47:28,119][15372] Avg episode reward: [(0, '42.487')] [2024-08-05 18:47:30,965][15444] Updated weights for policy 0, policy_version 43741 (0.0019) [2024-08-05 18:47:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 358375424. Throughput: 0: 6028.4. Samples: 89599520. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:47:33,126][15372] Avg episode reward: [(0, '43.378')] [2024-08-05 18:47:34,549][15444] Updated weights for policy 0, policy_version 43751 (0.0023) [2024-08-05 18:47:37,805][15444] Updated weights for policy 0, policy_version 43761 (0.0012) [2024-08-05 18:47:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 358490112. Throughput: 0: 6006.0. Samples: 89617090. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:47:38,119][15372] Avg episode reward: [(0, '42.783')] [2024-08-05 18:47:41,123][15444] Updated weights for policy 0, policy_version 43771 (0.0020) [2024-08-05 18:47:43,128][15372] Fps is (10 sec: 23733.9, 60 sec: 24162.5, 300 sec: 24158.7). Total num frames: 358612992. Throughput: 0: 6024.7. Samples: 89653630. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:47:43,137][15372] Avg episode reward: [(0, '42.440')] [2024-08-05 18:47:44,680][15444] Updated weights for policy 0, policy_version 43781 (0.0018) [2024-08-05 18:47:48,022][15444] Updated weights for policy 0, policy_version 43791 (0.0012) [2024-08-05 18:47:48,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 358735872. Throughput: 0: 6000.2. Samples: 89689470. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:47:48,119][15372] Avg episode reward: [(0, '42.571')] [2024-08-05 18:47:51,311][15444] Updated weights for policy 0, policy_version 43801 (0.0019) [2024-08-05 18:47:53,119][15372] Fps is (10 sec: 23779.0, 60 sec: 24029.8, 300 sec: 24131.8). Total num frames: 358850560. Throughput: 0: 6020.6. Samples: 89708410. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:47:53,126][15372] Avg episode reward: [(0, '43.358')] [2024-08-05 18:47:53,370][15417] Signal inference workers to stop experience collection... (16200 times) [2024-08-05 18:47:53,371][15417] Signal inference workers to resume experience collection... 
(16200 times) [2024-08-05 18:47:53,443][15444] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-08-05 18:47:53,444][15444] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-08-05 18:47:54,842][15444] Updated weights for policy 0, policy_version 43811 (0.0011) [2024-08-05 18:47:58,031][15444] Updated weights for policy 0, policy_version 43821 (0.0013) [2024-08-05 18:47:58,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 358981632. Throughput: 0: 6035.3. Samples: 89744700. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:47:58,119][15372] Avg episode reward: [(0, '43.379')] [2024-08-05 18:48:01,666][15444] Updated weights for policy 0, policy_version 43831 (0.0013) [2024-08-05 18:48:03,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 359096320. Throughput: 0: 6032.0. Samples: 89780190. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 18:48:03,119][15372] Avg episode reward: [(0, '42.441')] [2024-08-05 18:48:04,970][15444] Updated weights for policy 0, policy_version 43841 (0.0012) [2024-08-05 18:48:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 359219200. Throughput: 0: 6044.0. Samples: 89799280. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:48:08,126][15372] Avg episode reward: [(0, '43.112')] [2024-08-05 18:48:08,457][15444] Updated weights for policy 0, policy_version 43851 (0.0013) [2024-08-05 18:48:11,761][15444] Updated weights for policy 0, policy_version 43861 (0.0009) [2024-08-05 18:48:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 359342080. Throughput: 0: 6042.0. Samples: 89835190. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:48:13,119][15372] Avg episode reward: [(0, '43.283')] [2024-08-05 18:48:15,121][15444] Updated weights for policy 0, policy_version 43871 (0.0017) [2024-08-05 18:48:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 359464960. Throughput: 0: 6046.4. Samples: 89871610. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:48:18,126][15372] Avg episode reward: [(0, '43.444')] [2024-08-05 18:48:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000043880_359464960.pth... [2024-08-05 18:48:18,259][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000043171_353656832.pth [2024-08-05 18:48:18,548][15444] Updated weights for policy 0, policy_version 43881 (0.0018) [2024-08-05 18:48:21,831][15444] Updated weights for policy 0, policy_version 43891 (0.0016) [2024-08-05 18:48:23,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 359579648. Throughput: 0: 6060.2. Samples: 89889800. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:48:23,119][15372] Avg episode reward: [(0, '43.013')] [2024-08-05 18:48:25,355][15444] Updated weights for policy 0, policy_version 43901 (0.0018) [2024-08-05 18:48:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 359702528. Throughput: 0: 6039.7. Samples: 89925360. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:48:28,119][15372] Avg episode reward: [(0, '43.154')] [2024-08-05 18:48:28,953][15444] Updated weights for policy 0, policy_version 43911 (0.0012) [2024-08-05 18:48:32,093][15444] Updated weights for policy 0, policy_version 43921 (0.0034) [2024-08-05 18:48:33,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 359817216. 
Throughput: 0: 6035.1. Samples: 89961050. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:48:33,119][15372] Avg episode reward: [(0, '42.910')] [2024-08-05 18:48:34,035][15417] Signal inference workers to stop experience collection... (16250 times) [2024-08-05 18:48:34,035][15417] Signal inference workers to resume experience collection... (16250 times) [2024-08-05 18:48:34,090][15444] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-08-05 18:48:34,095][15444] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-08-05 18:48:35,710][15444] Updated weights for policy 0, policy_version 43931 (0.0027) [2024-08-05 18:48:38,126][15372] Fps is (10 sec: 23738.4, 60 sec: 24163.3, 300 sec: 24131.1). Total num frames: 359940096. Throughput: 0: 6025.9. Samples: 89979620. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:48:38,127][15372] Avg episode reward: [(0, '42.637')] [2024-08-05 18:48:39,258][15444] Updated weights for policy 0, policy_version 43941 (0.0013) [2024-08-05 18:48:42,343][15444] Updated weights for policy 0, policy_version 43951 (0.0014) [2024-08-05 18:48:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24170.3, 300 sec: 24131.7). Total num frames: 360062976. Throughput: 0: 6032.2. Samples: 90016150. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:48:43,126][15372] Avg episode reward: [(0, '42.979')] [2024-08-05 18:48:46,017][15444] Updated weights for policy 0, policy_version 43961 (0.0023) [2024-08-05 18:48:48,118][15372] Fps is (10 sec: 23775.2, 60 sec: 24029.9, 300 sec: 24132.2). Total num frames: 360177664. Throughput: 0: 6036.9. Samples: 90051850. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:48:48,119][15372] Avg episode reward: [(0, '42.335')] [2024-08-05 18:48:49,103][15444] Updated weights for policy 0, policy_version 43971 (0.0029) [2024-08-05 18:48:52,563][15444] Updated weights for policy 0, policy_version 43981 (0.0018) [2024-08-05 18:48:53,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 360300544. Throughput: 0: 6006.0. Samples: 90069550. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:48:53,119][15372] Avg episode reward: [(0, '42.865')] [2024-08-05 18:48:56,258][15444] Updated weights for policy 0, policy_version 43991 (0.0018) [2024-08-05 18:48:58,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 360423424. Throughput: 0: 6008.8. Samples: 90105590. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 18:48:58,119][15372] Avg episode reward: [(0, '42.652')] [2024-08-05 18:48:59,333][15444] Updated weights for policy 0, policy_version 44001 (0.0035) [2024-08-05 18:49:02,902][15444] Updated weights for policy 0, policy_version 44011 (0.0023) [2024-08-05 18:49:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 360538112. Throughput: 0: 6007.1. Samples: 90141930. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:49:03,119][15372] Avg episode reward: [(0, '43.211')] [2024-08-05 18:49:05,995][15444] Updated weights for policy 0, policy_version 44021 (0.0018) [2024-08-05 18:49:08,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 360669184. Throughput: 0: 6013.8. Samples: 90160420. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:49:08,119][15372] Avg episode reward: [(0, '42.883')] [2024-08-05 18:49:09,810][15444] Updated weights for policy 0, policy_version 44031 (0.0012) [2024-08-05 18:49:13,119][15372] Fps is (10 sec: 23756.2, 60 sec: 23893.3, 300 sec: 24131.7). 
Total num frames: 360775680. Throughput: 0: 6012.4. Samples: 90195920. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:49:13,126][15372] Avg episode reward: [(0, '41.726')] [2024-08-05 18:49:13,196][15444] Updated weights for policy 0, policy_version 44041 (0.0021) [2024-08-05 18:49:14,232][15417] Signal inference workers to stop experience collection... (16300 times) [2024-08-05 18:49:14,232][15417] Signal inference workers to resume experience collection... (16300 times) [2024-08-05 18:49:14,264][15444] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-08-05 18:49:14,265][15444] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-08-05 18:49:16,353][15444] Updated weights for policy 0, policy_version 44051 (0.0014) [2024-08-05 18:49:18,119][15372] Fps is (10 sec: 22937.3, 60 sec: 23893.3, 300 sec: 24159.4). Total num frames: 360898560. Throughput: 0: 6027.8. Samples: 90232300. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:49:18,119][15372] Avg episode reward: [(0, '42.198')] [2024-08-05 18:49:19,591][15444] Updated weights for policy 0, policy_version 44061 (0.0019) [2024-08-05 18:49:23,048][15444] Updated weights for policy 0, policy_version 44071 (0.0026) [2024-08-05 18:49:23,118][15372] Fps is (10 sec: 25395.8, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 361029632. Throughput: 0: 6033.0. Samples: 90251060. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:49:23,119][15372] Avg episode reward: [(0, '42.709')] [2024-08-05 18:49:26,449][15444] Updated weights for policy 0, policy_version 44081 (0.0014) [2024-08-05 18:49:28,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 361144320. Throughput: 0: 6020.2. Samples: 90287060. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 18:49:28,126][15372] Avg episode reward: [(0, '42.450')] [2024-08-05 18:49:29,900][15444] Updated weights for policy 0, policy_version 44091 (0.0010) [2024-08-05 18:49:33,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24166.3, 300 sec: 24159.9). Total num frames: 361267200. Throughput: 0: 6040.6. Samples: 90323680. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:49:33,127][15372] Avg episode reward: [(0, '41.666')] [2024-08-05 18:49:33,219][15444] Updated weights for policy 0, policy_version 44101 (0.0013) [2024-08-05 18:49:36,613][15444] Updated weights for policy 0, policy_version 44111 (0.0020) [2024-08-05 18:49:38,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24169.4, 300 sec: 24159.4). Total num frames: 361390080. Throughput: 0: 6056.4. Samples: 90342090. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:49:38,126][15372] Avg episode reward: [(0, '42.136')] [2024-08-05 18:49:39,807][15444] Updated weights for policy 0, policy_version 44121 (0.0015) [2024-08-05 18:49:43,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 361512960. Throughput: 0: 6074.3. Samples: 90378930. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:49:43,126][15372] Avg episode reward: [(0, '42.717')] [2024-08-05 18:49:43,218][15444] Updated weights for policy 0, policy_version 44131 (0.0024) [2024-08-05 18:49:46,709][15444] Updated weights for policy 0, policy_version 44141 (0.0012) [2024-08-05 18:49:48,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 361635840. Throughput: 0: 6057.8. Samples: 90414530. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:49:48,119][15372] Avg episode reward: [(0, '43.364')] [2024-08-05 18:49:50,005][15444] Updated weights for policy 0, policy_version 44151 (0.0028) [2024-08-05 18:49:53,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24187.2). 
Total num frames: 361758720. Throughput: 0: 6074.7. Samples: 90433780. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:49:53,126][15372] Avg episode reward: [(0, '42.656')] [2024-08-05 18:49:53,415][15444] Updated weights for policy 0, policy_version 44161 (0.0021) [2024-08-05 18:49:56,738][15444] Updated weights for policy 0, policy_version 44171 (0.0012) [2024-08-05 18:49:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.1, 300 sec: 24187.2). Total num frames: 361881600. Throughput: 0: 6087.6. Samples: 90469860. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:49:58,119][15372] Avg episode reward: [(0, '43.146')] [2024-08-05 18:50:00,215][15444] Updated weights for policy 0, policy_version 44181 (0.0020) [2024-08-05 18:50:03,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 361996288. Throughput: 0: 6069.8. Samples: 90505440. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:50:03,126][15372] Avg episode reward: [(0, '43.439')] [2024-08-05 18:50:03,579][15444] Updated weights for policy 0, policy_version 44191 (0.0016) [2024-08-05 18:50:04,555][15417] Signal inference workers to stop experience collection... (16350 times) [2024-08-05 18:50:04,556][15417] Signal inference workers to resume experience collection... (16350 times) [2024-08-05 18:50:04,599][15444] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-08-05 18:50:04,599][15444] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-08-05 18:50:06,853][15444] Updated weights for policy 0, policy_version 44201 (0.0018) [2024-08-05 18:50:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 362119168. Throughput: 0: 6056.0. Samples: 90523580. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:50:08,126][15372] Avg episode reward: [(0, '42.640')] [2024-08-05 18:50:10,432][15444] Updated weights for policy 0, policy_version 44211 (0.0024) [2024-08-05 18:50:13,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24439.6, 300 sec: 24159.5). Total num frames: 362242048. Throughput: 0: 6067.8. Samples: 90560110. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:50:13,119][15372] Avg episode reward: [(0, '42.674')] [2024-08-05 18:50:13,715][15444] Updated weights for policy 0, policy_version 44221 (0.0011) [2024-08-05 18:50:17,365][15444] Updated weights for policy 0, policy_version 44231 (0.0024) [2024-08-05 18:50:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24303.0, 300 sec: 24159.4). Total num frames: 362356736. Throughput: 0: 6048.7. Samples: 90595870. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:50:18,119][15372] Avg episode reward: [(0, '43.682')] [2024-08-05 18:50:18,146][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000044234_362364928.pth... [2024-08-05 18:50:18,263][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000043525_356556800.pth [2024-08-05 18:50:20,549][15444] Updated weights for policy 0, policy_version 44241 (0.0017) [2024-08-05 18:50:23,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 362479616. Throughput: 0: 6044.9. Samples: 90614110. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 18:50:23,119][15372] Avg episode reward: [(0, '42.257')] [2024-08-05 18:50:23,997][15444] Updated weights for policy 0, policy_version 44251 (0.0031) [2024-08-05 18:50:27,487][15444] Updated weights for policy 0, policy_version 44261 (0.0018) [2024-08-05 18:50:28,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 362594304. 
Throughput: 0: 6027.8. Samples: 90650180. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:50:28,119][15372] Avg episode reward: [(0, '42.765')] [2024-08-05 18:50:30,788][15444] Updated weights for policy 0, policy_version 44271 (0.0025) [2024-08-05 18:50:33,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 362717184. Throughput: 0: 6030.9. Samples: 90685920. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:50:33,127][15372] Avg episode reward: [(0, '43.008')] [2024-08-05 18:50:34,387][15444] Updated weights for policy 0, policy_version 44281 (0.0020) [2024-08-05 18:50:37,782][15444] Updated weights for policy 0, policy_version 44291 (0.0025) [2024-08-05 18:50:38,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 362831872. Throughput: 0: 5998.0. Samples: 90703690. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:50:38,119][15372] Avg episode reward: [(0, '43.078')] [2024-08-05 18:50:40,994][15444] Updated weights for policy 0, policy_version 44301 (0.0015) [2024-08-05 18:50:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 362954752. Throughput: 0: 5994.4. Samples: 90739610. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:50:43,119][15372] Avg episode reward: [(0, '42.342')] [2024-08-05 18:50:44,796][15444] Updated weights for policy 0, policy_version 44311 (0.0022) [2024-08-05 18:50:47,991][15444] Updated weights for policy 0, policy_version 44321 (0.0013) [2024-08-05 18:50:48,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 363077632. Throughput: 0: 6001.4. Samples: 90775500. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:50:48,119][15372] Avg episode reward: [(0, '42.763')] [2024-08-05 18:50:51,438][15444] Updated weights for policy 0, policy_version 44331 (0.0012) [2024-08-05 18:50:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 363200512. Throughput: 0: 6014.0. Samples: 90794210. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:50:53,126][15372] Avg episode reward: [(0, '42.814')] [2024-08-05 18:50:54,437][15417] Signal inference workers to stop experience collection... (16400 times) [2024-08-05 18:50:54,439][15417] Signal inference workers to resume experience collection... (16400 times) [2024-08-05 18:50:54,491][15444] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-08-05 18:50:54,491][15444] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-08-05 18:50:54,528][15444] Updated weights for policy 0, policy_version 44341 (0.0024) [2024-08-05 18:50:58,079][15444] Updated weights for policy 0, policy_version 44351 (0.0019) [2024-08-05 18:50:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 363323392. Throughput: 0: 6012.0. Samples: 90830650. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:50:58,119][15372] Avg episode reward: [(0, '43.345')] [2024-08-05 18:51:01,440][15444] Updated weights for policy 0, policy_version 44361 (0.0024) [2024-08-05 18:51:03,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 363446272. Throughput: 0: 6020.7. Samples: 90866800. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:51:03,126][15372] Avg episode reward: [(0, '43.828')] [2024-08-05 18:51:04,750][15444] Updated weights for policy 0, policy_version 44371 (0.0014) [2024-08-05 18:51:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 363560960. Throughput: 0: 6028.2. Samples: 90885380. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:51:08,126][15372] Avg episode reward: [(0, '43.818')] [2024-08-05 18:51:08,560][15444] Updated weights for policy 0, policy_version 44381 (0.0014) [2024-08-05 18:51:11,482][15444] Updated weights for policy 0, policy_version 44391 (0.0021) [2024-08-05 18:51:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 363683840. Throughput: 0: 6016.5. Samples: 90920920. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:51:13,126][15372] Avg episode reward: [(0, '43.377')] [2024-08-05 18:51:15,071][15444] Updated weights for policy 0, policy_version 44401 (0.0016) [2024-08-05 18:51:18,132][15372] Fps is (10 sec: 24542.4, 60 sec: 24160.9, 300 sec: 24130.6). Total num frames: 363806720. Throughput: 0: 6039.1. Samples: 90957760. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:51:18,140][15372] Avg episode reward: [(0, '42.786')] [2024-08-05 18:51:18,291][15444] Updated weights for policy 0, policy_version 44411 (0.0023) [2024-08-05 18:51:21,726][15444] Updated weights for policy 0, policy_version 44421 (0.0027) [2024-08-05 18:51:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 363929600. Throughput: 0: 6059.1. Samples: 90976350. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 18:51:23,119][15372] Avg episode reward: [(0, '42.396')] [2024-08-05 18:51:25,163][15444] Updated weights for policy 0, policy_version 44431 (0.0012) [2024-08-05 18:51:28,118][15372] Fps is (10 sec: 23789.3, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 364044288. Throughput: 0: 6056.2. Samples: 91012140. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 18:51:28,126][15372] Avg episode reward: [(0, '43.045')] [2024-08-05 18:51:28,450][15444] Updated weights for policy 0, policy_version 44441 (0.0021) [2024-08-05 18:51:32,250][15444] Updated weights for policy 0, policy_version 44451 (0.0021) [2024-08-05 18:51:33,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 364175360. Throughput: 0: 6054.0. Samples: 91047930. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 18:51:33,119][15372] Avg episode reward: [(0, '42.919')] [2024-08-05 18:51:35,118][15444] Updated weights for policy 0, policy_version 44461 (0.0019) [2024-08-05 18:51:36,632][15417] Signal inference workers to stop experience collection... (16450 times) [2024-08-05 18:51:36,633][15417] Signal inference workers to resume experience collection... (16450 times) [2024-08-05 18:51:36,661][15444] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-08-05 18:51:36,665][15444] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-08-05 18:51:38,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 364281856. Throughput: 0: 6048.0. Samples: 91066370. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 18:51:38,119][15372] Avg episode reward: [(0, '42.217')] [2024-08-05 18:51:38,728][15444] Updated weights for policy 0, policy_version 44471 (0.0018) [2024-08-05 18:51:42,211][15444] Updated weights for policy 0, policy_version 44481 (0.0016) [2024-08-05 18:51:43,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 364412928. Throughput: 0: 6047.3. Samples: 91102780. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 18:51:43,119][15372] Avg episode reward: [(0, '42.267')] [2024-08-05 18:51:45,323][15444] Updated weights for policy 0, policy_version 44491 (0.0018) [2024-08-05 18:51:48,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 364535808. Throughput: 0: 6038.0. Samples: 91138510. Policy #0 lag: (min: 0.0, avg: 4.7, max: 10.0) [2024-08-05 18:51:48,119][15372] Avg episode reward: [(0, '42.534')] [2024-08-05 18:51:49,136][15444] Updated weights for policy 0, policy_version 44501 (0.0018) [2024-08-05 18:51:52,150][15444] Updated weights for policy 0, policy_version 44511 (0.0011) [2024-08-05 18:51:53,118][15372] Fps is (10 sec: 22937.7, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 364642304. Throughput: 0: 6037.8. Samples: 91157080. Policy #0 lag: (min: 0.0, avg: 4.7, max: 10.0) [2024-08-05 18:51:53,126][15372] Avg episode reward: [(0, '43.602')] [2024-08-05 18:51:55,779][15444] Updated weights for policy 0, policy_version 44521 (0.0022) [2024-08-05 18:51:58,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 364773376. Throughput: 0: 6045.5. Samples: 91192970. Policy #0 lag: (min: 0.0, avg: 4.7, max: 10.0) [2024-08-05 18:51:58,119][15372] Avg episode reward: [(0, '43.558')] [2024-08-05 18:51:59,241][15444] Updated weights for policy 0, policy_version 44531 (0.0012) [2024-08-05 18:52:02,558][15444] Updated weights for policy 0, policy_version 44541 (0.0012) [2024-08-05 18:52:03,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 364896256. Throughput: 0: 6016.3. Samples: 91228410. Policy #0 lag: (min: 0.0, avg: 4.7, max: 10.0) [2024-08-05 18:52:03,119][15372] Avg episode reward: [(0, '42.366')] [2024-08-05 18:52:06,048][15444] Updated weights for policy 0, policy_version 44551 (0.0016) [2024-08-05 18:52:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24131.7). 
Total num frames: 365010944. Throughput: 0: 6013.8. Samples: 91246970. Policy #0 lag: (min: 0.0, avg: 4.7, max: 10.0) [2024-08-05 18:52:08,119][15372] Avg episode reward: [(0, '43.288')] [2024-08-05 18:52:09,418][15444] Updated weights for policy 0, policy_version 44561 (0.0010) [2024-08-05 18:52:12,908][15444] Updated weights for policy 0, policy_version 44571 (0.0031) [2024-08-05 18:52:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 365133824. Throughput: 0: 6020.4. Samples: 91283060. Policy #0 lag: (min: 0.0, avg: 4.7, max: 10.0) [2024-08-05 18:52:13,119][15372] Avg episode reward: [(0, '43.210')] [2024-08-05 18:52:16,152][15444] Updated weights for policy 0, policy_version 44581 (0.0025) [2024-08-05 18:52:16,563][15417] Signal inference workers to stop experience collection... (16500 times) [2024-08-05 18:52:16,564][15417] Signal inference workers to resume experience collection... (16500 times) [2024-08-05 18:52:16,601][15444] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-08-05 18:52:16,602][15444] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-08-05 18:52:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24035.3, 300 sec: 24103.9). Total num frames: 365248512. Throughput: 0: 6023.6. Samples: 91318990. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:52:18,119][15372] Avg episode reward: [(0, '42.996')] [2024-08-05 18:52:18,133][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000044587_365256704.pth... 
[2024-08-05 18:52:18,283][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000043880_359464960.pth [2024-08-05 18:52:19,537][15444] Updated weights for policy 0, policy_version 44591 (0.0021) [2024-08-05 18:52:22,964][15444] Updated weights for policy 0, policy_version 44601 (0.0021) [2024-08-05 18:52:23,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 365371392. Throughput: 0: 6017.8. Samples: 91337170. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:52:23,119][15372] Avg episode reward: [(0, '43.461')] [2024-08-05 18:52:26,514][15444] Updated weights for policy 0, policy_version 44611 (0.0017) [2024-08-05 18:52:28,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 365494272. Throughput: 0: 6006.4. Samples: 91373070. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:52:28,126][15372] Avg episode reward: [(0, '43.828')] [2024-08-05 18:52:29,775][15444] Updated weights for policy 0, policy_version 44621 (0.0011) [2024-08-05 18:52:33,118][15372] Fps is (10 sec: 23757.1, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 365608960. Throughput: 0: 6032.9. Samples: 91409990. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:52:33,119][15372] Avg episode reward: [(0, '42.457')] [2024-08-05 18:52:33,234][15444] Updated weights for policy 0, policy_version 44631 (0.0017) [2024-08-05 18:52:36,418][15444] Updated weights for policy 0, policy_version 44641 (0.0022) [2024-08-05 18:52:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24132.5). Total num frames: 365731840. Throughput: 0: 6021.8. Samples: 91428060. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:52:38,126][15372] Avg episode reward: [(0, '42.685')] [2024-08-05 18:52:39,866][15444] Updated weights for policy 0, policy_version 44651 (0.0020) [2024-08-05 18:52:43,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 365854720. Throughput: 0: 6035.3. Samples: 91464560. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 18:52:43,126][15372] Avg episode reward: [(0, '43.232')] [2024-08-05 18:52:43,276][15444] Updated weights for policy 0, policy_version 44661 (0.0018) [2024-08-05 18:52:46,640][15444] Updated weights for policy 0, policy_version 44671 (0.0029) [2024-08-05 18:52:48,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 365977600. Throughput: 0: 6042.9. Samples: 91500340. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:52:48,126][15372] Avg episode reward: [(0, '42.846')] [2024-08-05 18:52:49,970][15444] Updated weights for policy 0, policy_version 44681 (0.0010) [2024-08-05 18:52:53,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24302.8, 300 sec: 24131.7). Total num frames: 366100480. Throughput: 0: 6054.6. Samples: 91519430. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:52:53,126][15372] Avg episode reward: [(0, '43.509')] [2024-08-05 18:52:53,464][15444] Updated weights for policy 0, policy_version 44691 (0.0021) [2024-08-05 18:52:56,632][15444] Updated weights for policy 0, policy_version 44701 (0.0022) [2024-08-05 18:52:58,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 366223360. Throughput: 0: 6037.1. Samples: 91554730. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:52:58,126][15372] Avg episode reward: [(0, '43.387')] [2024-08-05 18:53:00,196][15444] Updated weights for policy 0, policy_version 44711 (0.0016) [2024-08-05 18:53:03,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.3, 300 sec: 24159.4). 
Total num frames: 366346240. Throughput: 0: 6064.9. Samples: 91591910. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:53:03,127][15372] Avg episode reward: [(0, '42.709')] [2024-08-05 18:53:03,636][15444] Updated weights for policy 0, policy_version 44721 (0.0023) [2024-08-05 18:53:06,701][15444] Updated weights for policy 0, policy_version 44731 (0.0016) [2024-08-05 18:53:08,119][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 366460928. Throughput: 0: 6068.2. Samples: 91610240. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:53:08,119][15372] Avg episode reward: [(0, '42.737')] [2024-08-05 18:53:10,227][15444] Updated weights for policy 0, policy_version 44741 (0.0028) [2024-08-05 18:53:13,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 366592000. Throughput: 0: 6080.4. Samples: 91646690. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:53:13,126][15372] Avg episode reward: [(0, '42.680')] [2024-08-05 18:53:13,519][15444] Updated weights for policy 0, policy_version 44751 (0.0020) [2024-08-05 18:53:16,920][15444] Updated weights for policy 0, policy_version 44761 (0.0017) [2024-08-05 18:53:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 366698496. Throughput: 0: 6055.8. Samples: 91682500. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:53:18,119][15372] Avg episode reward: [(0, '43.894')] [2024-08-05 18:53:20,553][15444] Updated weights for policy 0, policy_version 44771 (0.0027) [2024-08-05 18:53:20,865][15417] Signal inference workers to stop experience collection... (16550 times) [2024-08-05 18:53:20,865][15417] Signal inference workers to resume experience collection... 
(16550 times) [2024-08-05 18:53:20,932][15444] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-08-05 18:53:20,937][15444] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-08-05 18:53:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 366829568. Throughput: 0: 6069.3. Samples: 91701180. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:53:23,119][15372] Avg episode reward: [(0, '43.375')] [2024-08-05 18:53:23,485][15444] Updated weights for policy 0, policy_version 44781 (0.0025) [2024-08-05 18:53:27,121][15444] Updated weights for policy 0, policy_version 44791 (0.0011) [2024-08-05 18:53:28,119][15372] Fps is (10 sec: 25395.3, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 366952448. Throughput: 0: 6078.4. Samples: 91738090. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:53:28,119][15372] Avg episode reward: [(0, '42.205')] [2024-08-05 18:53:30,175][15444] Updated weights for policy 0, policy_version 44801 (0.0020) [2024-08-05 18:53:33,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24302.9, 300 sec: 24160.1). Total num frames: 367067136. Throughput: 0: 6076.4. Samples: 91773780. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:53:33,126][15372] Avg episode reward: [(0, '42.492')] [2024-08-05 18:53:33,892][15444] Updated weights for policy 0, policy_version 44811 (0.0018) [2024-08-05 18:53:37,446][15444] Updated weights for policy 0, policy_version 44821 (0.0033) [2024-08-05 18:53:38,121][15372] Fps is (10 sec: 23750.9, 60 sec: 24301.9, 300 sec: 24159.2). Total num frames: 367190016. Throughput: 0: 6066.6. Samples: 91792440. Policy #0 lag: (min: 1.0, avg: 3.4, max: 8.0) [2024-08-05 18:53:38,121][15372] Avg episode reward: [(0, '42.731')] [2024-08-05 18:53:40,605][15444] Updated weights for policy 0, policy_version 44831 (0.0013) [2024-08-05 18:53:43,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24187.2). 
Total num frames: 367312896. Throughput: 0: 6069.4. Samples: 91827850. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:53:43,126][15372] Avg episode reward: [(0, '43.352')] [2024-08-05 18:53:44,412][15444] Updated weights for policy 0, policy_version 44841 (0.0021) [2024-08-05 18:53:47,340][15444] Updated weights for policy 0, policy_version 44851 (0.0026) [2024-08-05 18:53:48,118][15372] Fps is (10 sec: 23763.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 367427584. Throughput: 0: 6030.2. Samples: 91863270. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:53:48,126][15372] Avg episode reward: [(0, '43.035')] [2024-08-05 18:53:51,015][15444] Updated weights for policy 0, policy_version 44861 (0.0023) [2024-08-05 18:53:53,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 367550464. Throughput: 0: 6023.8. Samples: 91881310. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:53:53,120][15372] Avg episode reward: [(0, '42.514')] [2024-08-05 18:53:54,764][15444] Updated weights for policy 0, policy_version 44871 (0.0019) [2024-08-05 18:53:57,691][15444] Updated weights for policy 0, policy_version 44881 (0.0018) [2024-08-05 18:53:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 367673344. Throughput: 0: 6018.5. Samples: 91917520. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:53:58,119][15372] Avg episode reward: [(0, '41.932')] [2024-08-05 18:54:01,256][15417] Signal inference workers to stop experience collection... (16600 times) [2024-08-05 18:54:01,257][15417] Signal inference workers to resume experience collection... 
(16600 times) [2024-08-05 18:54:01,296][15444] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-08-05 18:54:01,296][15444] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-08-05 18:54:01,345][15444] Updated weights for policy 0, policy_version 44891 (0.0022) [2024-08-05 18:54:03,118][15372] Fps is (10 sec: 22938.1, 60 sec: 23893.4, 300 sec: 24103.9). Total num frames: 367779840. Throughput: 0: 6012.3. Samples: 91953050. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:54:03,119][15372] Avg episode reward: [(0, '42.101')] [2024-08-05 18:54:04,541][15444] Updated weights for policy 0, policy_version 44901 (0.0029) [2024-08-05 18:54:08,030][15444] Updated weights for policy 0, policy_version 44911 (0.0010) [2024-08-05 18:54:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 367910912. Throughput: 0: 6014.2. Samples: 91971820. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 18:54:08,119][15372] Avg episode reward: [(0, '43.358')] [2024-08-05 18:54:11,616][15444] Updated weights for policy 0, policy_version 44921 (0.0014) [2024-08-05 18:54:13,119][15372] Fps is (10 sec: 25394.4, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 368033792. Throughput: 0: 6004.0. Samples: 92008270. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 18:54:13,119][15372] Avg episode reward: [(0, '41.930')] [2024-08-05 18:54:14,656][15444] Updated weights for policy 0, policy_version 44931 (0.0018) [2024-08-05 18:54:18,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 368148480. Throughput: 0: 6022.9. Samples: 92044810. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 18:54:18,126][15372] Avg episode reward: [(0, '41.614')] [2024-08-05 18:54:18,167][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000044941_368156672.pth... 
[2024-08-05 18:54:18,171][15444] Updated weights for policy 0, policy_version 44941 (0.0020) [2024-08-05 18:54:18,347][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000044234_362364928.pth [2024-08-05 18:54:21,370][15444] Updated weights for policy 0, policy_version 44951 (0.0013) [2024-08-05 18:54:23,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 368271360. Throughput: 0: 6012.6. Samples: 92062990. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 18:54:23,126][15372] Avg episode reward: [(0, '42.747')] [2024-08-05 18:54:24,927][15444] Updated weights for policy 0, policy_version 44961 (0.0032) [2024-08-05 18:54:28,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 368394240. Throughput: 0: 6028.4. Samples: 92099130. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 18:54:28,126][15372] Avg episode reward: [(0, '43.449')] [2024-08-05 18:54:28,592][15444] Updated weights for policy 0, policy_version 44971 (0.0013) [2024-08-05 18:54:31,652][15444] Updated weights for policy 0, policy_version 44981 (0.0020) [2024-08-05 18:54:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 368517120. Throughput: 0: 6036.9. Samples: 92134930. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 18:54:33,126][15372] Avg episode reward: [(0, '43.623')] [2024-08-05 18:54:35,062][15444] Updated weights for policy 0, policy_version 44991 (0.0030) [2024-08-05 18:54:38,124][15372] Fps is (10 sec: 24562.6, 60 sec: 24165.3, 300 sec: 24159.0). Total num frames: 368640000. Throughput: 0: 6035.1. Samples: 92152920. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:54:38,132][15372] Avg episode reward: [(0, '43.344')] [2024-08-05 18:54:38,502][15444] Updated weights for policy 0, policy_version 45001 (0.0026) [2024-08-05 18:54:41,932][15444] Updated weights for policy 0, policy_version 45011 (0.0026) [2024-08-05 18:54:43,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 368754688. Throughput: 0: 6037.3. Samples: 92189200. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:54:43,119][15372] Avg episode reward: [(0, '42.328')] [2024-08-05 18:54:45,224][15444] Updated weights for policy 0, policy_version 45021 (0.0020) [2024-08-05 18:54:48,119][15372] Fps is (10 sec: 23768.5, 60 sec: 24166.2, 300 sec: 24131.7). Total num frames: 368877568. Throughput: 0: 6064.8. Samples: 92225970. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:54:48,127][15372] Avg episode reward: [(0, '41.955')] [2024-08-05 18:54:48,663][15444] Updated weights for policy 0, policy_version 45031 (0.0018) [2024-08-05 18:54:51,891][15444] Updated weights for policy 0, policy_version 45041 (0.0019) [2024-08-05 18:54:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 369000448. Throughput: 0: 6053.6. Samples: 92244230. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:54:53,119][15372] Avg episode reward: [(0, '42.532')] [2024-08-05 18:54:55,526][15444] Updated weights for policy 0, policy_version 45051 (0.0019) [2024-08-05 18:54:57,938][15417] Signal inference workers to stop experience collection... (16650 times) [2024-08-05 18:54:57,938][15417] Signal inference workers to resume experience collection... 
(16650 times) [2024-08-05 18:54:57,987][15444] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-08-05 18:54:58,000][15444] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-08-05 18:54:58,119][15372] Fps is (10 sec: 24576.9, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 369123328. Throughput: 0: 6056.9. Samples: 92280830. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:54:58,119][15372] Avg episode reward: [(0, '42.431')] [2024-08-05 18:54:58,640][15444] Updated weights for policy 0, policy_version 45061 (0.0017) [2024-08-05 18:55:02,146][15444] Updated weights for policy 0, policy_version 45071 (0.0016) [2024-08-05 18:55:03,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24439.3, 300 sec: 24159.4). Total num frames: 369246208. Throughput: 0: 6046.6. Samples: 92316910. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:55:03,119][15372] Avg episode reward: [(0, '42.969')] [2024-08-05 18:55:05,537][15444] Updated weights for policy 0, policy_version 45081 (0.0012) [2024-08-05 18:55:08,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 369360896. Throughput: 0: 6049.5. Samples: 92335220. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:55:08,119][15372] Avg episode reward: [(0, '41.887')] [2024-08-05 18:55:08,841][15444] Updated weights for policy 0, policy_version 45091 (0.0016) [2024-08-05 18:55:12,344][15444] Updated weights for policy 0, policy_version 45101 (0.0012) [2024-08-05 18:55:13,118][15372] Fps is (10 sec: 23758.0, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 369483776. Throughput: 0: 6043.6. Samples: 92371090. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:55:13,119][15372] Avg episode reward: [(0, '43.321')] [2024-08-05 18:55:15,527][15444] Updated weights for policy 0, policy_version 45111 (0.0010) [2024-08-05 18:55:18,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.0, 300 sec: 24159.5). 
Total num frames: 369606656. Throughput: 0: 6057.3. Samples: 92407510. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:55:18,128][15372] Avg episode reward: [(0, '44.123')] [2024-08-05 18:55:19,126][15444] Updated weights for policy 0, policy_version 45121 (0.0020) [2024-08-05 18:55:22,799][15444] Updated weights for policy 0, policy_version 45131 (0.0020) [2024-08-05 18:55:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 369721344. Throughput: 0: 6064.3. Samples: 92425780. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:55:23,119][15372] Avg episode reward: [(0, '43.464')] [2024-08-05 18:55:25,769][15444] Updated weights for policy 0, policy_version 45141 (0.0017) [2024-08-05 18:55:28,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 369844224. Throughput: 0: 6040.2. Samples: 92461010. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:55:28,120][15372] Avg episode reward: [(0, '43.175')] [2024-08-05 18:55:29,469][15444] Updated weights for policy 0, policy_version 45151 (0.0022) [2024-08-05 18:55:32,858][15444] Updated weights for policy 0, policy_version 45161 (0.0012) [2024-08-05 18:55:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 369967104. Throughput: 0: 6023.9. Samples: 92497040. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:55:33,119][15372] Avg episode reward: [(0, '42.914')] [2024-08-05 18:55:36,116][15444] Updated weights for policy 0, policy_version 45171 (0.0012) [2024-08-05 18:55:38,119][15372] Fps is (10 sec: 23757.3, 60 sec: 24032.0, 300 sec: 24159.5). Total num frames: 370081792. Throughput: 0: 6030.2. Samples: 92515590. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:55:38,126][15372] Avg episode reward: [(0, '42.951')] [2024-08-05 18:55:39,525][15444] Updated weights for policy 0, policy_version 45181 (0.0037) [2024-08-05 18:55:42,746][15444] Updated weights for policy 0, policy_version 45191 (0.0014) [2024-08-05 18:55:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 370204672. Throughput: 0: 6042.5. Samples: 92552740. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:55:43,119][15372] Avg episode reward: [(0, '43.361')] [2024-08-05 18:55:46,267][15444] Updated weights for policy 0, policy_version 45201 (0.0026) [2024-08-05 18:55:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.6, 300 sec: 24159.5). Total num frames: 370327552. Throughput: 0: 6034.7. Samples: 92588470. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:55:48,126][15372] Avg episode reward: [(0, '42.507')] [2024-08-05 18:55:49,514][15444] Updated weights for policy 0, policy_version 45211 (0.0022) [2024-08-05 18:55:51,470][15417] Signal inference workers to stop experience collection... (16700 times) [2024-08-05 18:55:51,472][15417] Signal inference workers to resume experience collection... (16700 times) [2024-08-05 18:55:51,516][15444] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-08-05 18:55:51,517][15444] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-08-05 18:55:52,973][15444] Updated weights for policy 0, policy_version 45221 (0.0014) [2024-08-05 18:55:53,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 370450432. Throughput: 0: 6038.4. Samples: 92606950. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:55:53,119][15372] Avg episode reward: [(0, '42.942')] [2024-08-05 18:55:56,058][15444] Updated weights for policy 0, policy_version 45231 (0.0016) [2024-08-05 18:55:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 370573312. Throughput: 0: 6058.0. Samples: 92643700. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 18:55:58,119][15372] Avg episode reward: [(0, '43.234')] [2024-08-05 18:55:59,639][15444] Updated weights for policy 0, policy_version 45241 (0.0013) [2024-08-05 18:56:03,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24030.0, 300 sec: 24159.4). Total num frames: 370688000. Throughput: 0: 6046.2. Samples: 92679590. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:56:03,127][15372] Avg episode reward: [(0, '42.843')] [2024-08-05 18:56:03,134][15444] Updated weights for policy 0, policy_version 45251 (0.0014) [2024-08-05 18:56:06,650][15444] Updated weights for policy 0, policy_version 45261 (0.0021) [2024-08-05 18:56:08,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 370810880. Throughput: 0: 6051.1. Samples: 92698080. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:56:08,126][15372] Avg episode reward: [(0, '42.331')] [2024-08-05 18:56:10,093][15444] Updated weights for policy 0, policy_version 45271 (0.0020) [2024-08-05 18:56:13,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.3, 300 sec: 24160.6). Total num frames: 370933760. Throughput: 0: 6063.6. Samples: 92733870. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:56:13,126][15372] Avg episode reward: [(0, '43.022')] [2024-08-05 18:56:13,485][15444] Updated weights for policy 0, policy_version 45281 (0.0018) [2024-08-05 18:56:16,736][15444] Updated weights for policy 0, policy_version 45291 (0.0010) [2024-08-05 18:56:18,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24159.5). 
Total num frames: 371056640. Throughput: 0: 6060.7. Samples: 92769770. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:56:18,119][15372] Avg episode reward: [(0, '42.400')] [2024-08-05 18:56:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000045295_371056640.pth... [2024-08-05 18:56:18,284][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000044587_365256704.pth [2024-08-05 18:56:20,138][15444] Updated weights for policy 0, policy_version 45301 (0.0017) [2024-08-05 18:56:23,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 371179520. Throughput: 0: 6050.7. Samples: 92787870. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:56:23,126][15372] Avg episode reward: [(0, '42.107')] [2024-08-05 18:56:23,651][15444] Updated weights for policy 0, policy_version 45311 (0.0012) [2024-08-05 18:56:26,885][15444] Updated weights for policy 0, policy_version 45321 (0.0027) [2024-08-05 18:56:28,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 371294208. Throughput: 0: 6021.3. Samples: 92823700. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 18:56:28,119][15372] Avg episode reward: [(0, '42.497')] [2024-08-05 18:56:30,244][15444] Updated weights for policy 0, policy_version 45331 (0.0011) [2024-08-05 18:56:33,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 371417088. Throughput: 0: 6029.1. Samples: 92859780. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:56:33,126][15372] Avg episode reward: [(0, '43.297')] [2024-08-05 18:56:33,866][15444] Updated weights for policy 0, policy_version 45341 (0.0038) [2024-08-05 18:56:37,255][15444] Updated weights for policy 0, policy_version 45351 (0.0031) [2024-08-05 18:56:38,119][15372] Fps is (10 sec: 24576.3, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 371539968. Throughput: 0: 6026.6. Samples: 92878150. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:56:38,119][15372] Avg episode reward: [(0, '43.078')] [2024-08-05 18:56:40,484][15444] Updated weights for policy 0, policy_version 45361 (0.0029) [2024-08-05 18:56:43,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 371654656. Throughput: 0: 6004.4. Samples: 92913900. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:56:43,126][15372] Avg episode reward: [(0, '43.744')] [2024-08-05 18:56:44,015][15444] Updated weights for policy 0, policy_version 45371 (0.0019) [2024-08-05 18:56:47,306][15444] Updated weights for policy 0, policy_version 45381 (0.0011) [2024-08-05 18:56:48,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 371777536. Throughput: 0: 6012.9. Samples: 92950170. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:56:48,119][15372] Avg episode reward: [(0, '43.137')] [2024-08-05 18:56:50,709][15444] Updated weights for policy 0, policy_version 45391 (0.0011) [2024-08-05 18:56:53,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 371892224. Throughput: 0: 6011.8. Samples: 92968610. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:56:53,126][15372] Avg episode reward: [(0, '43.841')] [2024-08-05 18:56:54,239][15444] Updated weights for policy 0, policy_version 45401 (0.0012) [2024-08-05 18:56:56,711][15417] Signal inference workers to stop experience collection... 
(16750 times) [2024-08-05 18:56:56,711][15417] Signal inference workers to resume experience collection... (16750 times) [2024-08-05 18:56:56,761][15444] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-08-05 18:56:56,761][15444] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-08-05 18:56:57,547][15444] Updated weights for policy 0, policy_version 45411 (0.0021) [2024-08-05 18:56:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 372015104. Throughput: 0: 6018.0. Samples: 93004680. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 18:56:58,126][15372] Avg episode reward: [(0, '43.248')] [2024-08-05 18:57:00,799][15444] Updated weights for policy 0, policy_version 45421 (0.0017) [2024-08-05 18:57:03,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 372137984. Throughput: 0: 6033.8. Samples: 93041290. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:57:03,119][15372] Avg episode reward: [(0, '42.083')] [2024-08-05 18:57:04,514][15444] Updated weights for policy 0, policy_version 45431 (0.0021) [2024-08-05 18:57:07,787][15444] Updated weights for policy 0, policy_version 45441 (0.0014) [2024-08-05 18:57:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 372252672. Throughput: 0: 6018.5. Samples: 93058700. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:57:08,119][15372] Avg episode reward: [(0, '42.227')] [2024-08-05 18:57:11,386][15444] Updated weights for policy 0, policy_version 45451 (0.0014) [2024-08-05 18:57:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 372375552. Throughput: 0: 6007.8. Samples: 93094050. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:57:13,119][15372] Avg episode reward: [(0, '42.673')] [2024-08-05 18:57:14,498][15444] Updated weights for policy 0, policy_version 45461 (0.0022) [2024-08-05 18:57:17,964][15444] Updated weights for policy 0, policy_version 45471 (0.0021) [2024-08-05 18:57:18,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 372498432. Throughput: 0: 6026.0. Samples: 93130950. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:57:18,119][15372] Avg episode reward: [(0, '42.594')] [2024-08-05 18:57:21,511][15444] Updated weights for policy 0, policy_version 45481 (0.0011) [2024-08-05 18:57:23,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 372621312. Throughput: 0: 6019.8. Samples: 93149040. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:57:23,119][15372] Avg episode reward: [(0, '43.567')] [2024-08-05 18:57:24,886][15444] Updated weights for policy 0, policy_version 45491 (0.0031) [2024-08-05 18:57:28,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24030.0, 300 sec: 24159.5). Total num frames: 372736000. Throughput: 0: 6041.6. Samples: 93185770. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 18:57:28,126][15372] Avg episode reward: [(0, '43.060')] [2024-08-05 18:57:28,134][15444] Updated weights for policy 0, policy_version 45501 (0.0034) [2024-08-05 18:57:31,336][15444] Updated weights for policy 0, policy_version 45511 (0.0017) [2024-08-05 18:57:33,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 372858880. Throughput: 0: 6036.0. Samples: 93221790. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:57:33,126][15372] Avg episode reward: [(0, '42.940')] [2024-08-05 18:57:34,800][15444] Updated weights for policy 0, policy_version 45521 (0.0013) [2024-08-05 18:57:38,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24159.5). 
Total num frames: 372981760. Throughput: 0: 6035.1. Samples: 93240190. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:57:38,126][15372] Avg episode reward: [(0, '42.824')] [2024-08-05 18:57:38,207][15444] Updated weights for policy 0, policy_version 45531 (0.0012) [2024-08-05 18:57:41,523][15444] Updated weights for policy 0, policy_version 45541 (0.0024) [2024-08-05 18:57:43,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 373104640. Throughput: 0: 6051.1. Samples: 93276980. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:57:43,127][15372] Avg episode reward: [(0, '43.033')] [2024-08-05 18:57:44,862][15444] Updated weights for policy 0, policy_version 45551 (0.0020) [2024-08-05 18:57:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 373227520. Throughput: 0: 6043.8. Samples: 93313260. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:57:48,126][15372] Avg episode reward: [(0, '41.962')] [2024-08-05 18:57:48,467][15444] Updated weights for policy 0, policy_version 45561 (0.0026) [2024-08-05 18:57:51,631][15444] Updated weights for policy 0, policy_version 45571 (0.0012) [2024-08-05 18:57:53,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 373350400. Throughput: 0: 6059.6. Samples: 93331380. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 18:57:53,128][15372] Avg episode reward: [(0, '42.772')] [2024-08-05 18:57:55,046][15444] Updated weights for policy 0, policy_version 45581 (0.0014) [2024-08-05 18:57:58,128][15372] Fps is (10 sec: 24552.7, 60 sec: 24299.1, 300 sec: 24158.7). Total num frames: 373473280. Throughput: 0: 6085.8. Samples: 93367970. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:57:58,136][15372] Avg episode reward: [(0, '43.394')] [2024-08-05 18:57:58,345][15444] Updated weights for policy 0, policy_version 45591 (0.0011) [2024-08-05 18:58:01,952][15444] Updated weights for policy 0, policy_version 45601 (0.0031) [2024-08-05 18:58:03,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 373587968. Throughput: 0: 6066.7. Samples: 93403950. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:58:03,119][15372] Avg episode reward: [(0, '42.353')] [2024-08-05 18:58:04,623][15417] Signal inference workers to stop experience collection... (16800 times) [2024-08-05 18:58:04,623][15417] Signal inference workers to resume experience collection... (16800 times) [2024-08-05 18:58:04,669][15444] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-08-05 18:58:04,669][15444] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-08-05 18:58:05,206][15444] Updated weights for policy 0, policy_version 45611 (0.0014) [2024-08-05 18:58:08,119][15372] Fps is (10 sec: 23779.0, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 373710848. Throughput: 0: 6078.7. Samples: 93422580. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:58:08,119][15372] Avg episode reward: [(0, '42.295')] [2024-08-05 18:58:08,545][15444] Updated weights for policy 0, policy_version 45621 (0.0021) [2024-08-05 18:58:12,143][15444] Updated weights for policy 0, policy_version 45631 (0.0017) [2024-08-05 18:58:13,119][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 373833728. Throughput: 0: 6056.7. Samples: 93458320. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:58:13,119][15372] Avg episode reward: [(0, '43.350')] [2024-08-05 18:58:15,443][15444] Updated weights for policy 0, policy_version 45641 (0.0018) [2024-08-05 18:58:18,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24302.8, 300 sec: 24159.4). Total num frames: 373956608. Throughput: 0: 6081.0. Samples: 93495440. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:58:18,119][15372] Avg episode reward: [(0, '44.544')] [2024-08-05 18:58:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000045649_373956608.pth... [2024-08-05 18:58:18,258][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000044941_368156672.pth [2024-08-05 18:58:18,265][15417] Saving new best policy, reward=44.544! [2024-08-05 18:58:18,723][15444] Updated weights for policy 0, policy_version 45651 (0.0017) [2024-08-05 18:58:22,242][15444] Updated weights for policy 0, policy_version 45661 (0.0022) [2024-08-05 18:58:23,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 374071296. Throughput: 0: 6057.3. Samples: 93512770. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 18:58:23,119][15372] Avg episode reward: [(0, '43.663')] [2024-08-05 18:58:25,480][15444] Updated weights for policy 0, policy_version 45671 (0.0015) [2024-08-05 18:58:28,120][15372] Fps is (10 sec: 23754.4, 60 sec: 24302.3, 300 sec: 24159.4). Total num frames: 374194176. Throughput: 0: 6048.7. Samples: 93549180. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:58:28,120][15372] Avg episode reward: [(0, '43.289')] [2024-08-05 18:58:28,896][15444] Updated weights for policy 0, policy_version 45681 (0.0030) [2024-08-05 18:58:32,335][15444] Updated weights for policy 0, policy_version 45691 (0.0024) [2024-08-05 18:58:33,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24302.9, 300 sec: 24159.7). Total num frames: 374317056. Throughput: 0: 6048.9. Samples: 93585460. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:58:33,119][15372] Avg episode reward: [(0, '43.576')] [2024-08-05 18:58:35,574][15444] Updated weights for policy 0, policy_version 45701 (0.0022) [2024-08-05 18:58:38,118][15372] Fps is (10 sec: 24579.6, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 374439936. Throughput: 0: 6045.8. Samples: 93603440. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:58:38,126][15372] Avg episode reward: [(0, '42.845')] [2024-08-05 18:58:39,357][15444] Updated weights for policy 0, policy_version 45711 (0.0020) [2024-08-05 18:58:42,844][15444] Updated weights for policy 0, policy_version 45721 (0.0020) [2024-08-05 18:58:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 374554624. Throughput: 0: 6032.2. Samples: 93639360. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:58:43,119][15372] Avg episode reward: [(0, '42.669')] [2024-08-05 18:58:45,915][15444] Updated weights for policy 0, policy_version 45731 (0.0016) [2024-08-05 18:58:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 374677504. Throughput: 0: 6027.8. Samples: 93675200. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:58:48,126][15372] Avg episode reward: [(0, '43.231')] [2024-08-05 18:58:49,581][15444] Updated weights for policy 0, policy_version 45741 (0.0017) [2024-08-05 18:58:52,855][15444] Updated weights for policy 0, policy_version 45751 (0.0030) [2024-08-05 18:58:53,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 374800384. Throughput: 0: 6010.2. Samples: 93693040. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:58:53,119][15372] Avg episode reward: [(0, '43.368')] [2024-08-05 18:58:55,014][15417] Signal inference workers to stop experience collection... (16850 times) [2024-08-05 18:58:55,014][15417] Signal inference workers to resume experience collection... (16850 times) [2024-08-05 18:58:55,044][15444] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-08-05 18:58:55,044][15444] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-08-05 18:58:56,313][15444] Updated weights for policy 0, policy_version 45761 (0.0018) [2024-08-05 18:58:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24033.7, 300 sec: 24187.2). Total num frames: 374915072. Throughput: 0: 6025.8. Samples: 93729480. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:58:58,119][15372] Avg episode reward: [(0, '42.597')] [2024-08-05 18:58:59,487][15444] Updated weights for policy 0, policy_version 45771 (0.0017) [2024-08-05 18:59:02,968][15444] Updated weights for policy 0, policy_version 45781 (0.0011) [2024-08-05 18:59:03,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 375037952. Throughput: 0: 6002.1. Samples: 93765530. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:59:03,119][15372] Avg episode reward: [(0, '43.008')] [2024-08-05 18:59:06,519][15444] Updated weights for policy 0, policy_version 45791 (0.0018) [2024-08-05 18:59:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 375160832. Throughput: 0: 6028.9. Samples: 93784070. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:59:08,126][15372] Avg episode reward: [(0, '43.469')] [2024-08-05 18:59:09,654][15444] Updated weights for policy 0, policy_version 45801 (0.0021) [2024-08-05 18:59:13,081][15444] Updated weights for policy 0, policy_version 45811 (0.0027) [2024-08-05 18:59:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 375283712. Throughput: 0: 6037.5. Samples: 93820860. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:59:13,119][15372] Avg episode reward: [(0, '43.693')] [2024-08-05 18:59:16,279][15444] Updated weights for policy 0, policy_version 45821 (0.0014) [2024-08-05 18:59:18,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24029.9, 300 sec: 24159.4). Total num frames: 375398400. Throughput: 0: 6023.9. Samples: 93856540. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:59:18,127][15372] Avg episode reward: [(0, '42.811')] [2024-08-05 18:59:19,975][15444] Updated weights for policy 0, policy_version 45831 (0.0023) [2024-08-05 18:59:23,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 375521280. Throughput: 0: 6038.4. Samples: 93875170. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 18:59:23,126][15372] Avg episode reward: [(0, '42.828')] [2024-08-05 18:59:23,558][15444] Updated weights for policy 0, policy_version 45841 (0.0016) [2024-08-05 18:59:26,548][15444] Updated weights for policy 0, policy_version 45851 (0.0034) [2024-08-05 18:59:28,118][15372] Fps is (10 sec: 24577.2, 60 sec: 24167.0, 300 sec: 24159.5). 
Total num frames: 375644160. Throughput: 0: 6035.6. Samples: 93910960. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:59:28,126][15372] Avg episode reward: [(0, '42.713')] [2024-08-05 18:59:30,022][15444] Updated weights for policy 0, policy_version 45861 (0.0012) [2024-08-05 18:59:33,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24159.9). Total num frames: 375767040. Throughput: 0: 6065.3. Samples: 93948140. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:59:33,127][15372] Avg episode reward: [(0, '42.806')] [2024-08-05 18:59:33,172][15444] Updated weights for policy 0, policy_version 45871 (0.0011) [2024-08-05 18:59:36,845][15444] Updated weights for policy 0, policy_version 45881 (0.0011) [2024-08-05 18:59:37,470][15417] Signal inference workers to stop experience collection... (16900 times) [2024-08-05 18:59:37,478][15417] Signal inference workers to resume experience collection... (16900 times) [2024-08-05 18:59:37,546][15444] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-08-05 18:59:37,546][15444] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-08-05 18:59:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 375881728. Throughput: 0: 6068.0. Samples: 93966100. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:59:38,119][15372] Avg episode reward: [(0, '43.099')] [2024-08-05 18:59:40,223][15444] Updated weights for policy 0, policy_version 45891 (0.0016) [2024-08-05 18:59:43,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 376012800. Throughput: 0: 6069.3. Samples: 94002600. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:59:43,126][15372] Avg episode reward: [(0, '42.976')] [2024-08-05 18:59:43,434][15444] Updated weights for policy 0, policy_version 45901 (0.0011) [2024-08-05 18:59:46,776][15444] Updated weights for policy 0, policy_version 45911 (0.0023) [2024-08-05 18:59:48,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 376135680. Throughput: 0: 6078.4. Samples: 94039060. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 18:59:48,119][15372] Avg episode reward: [(0, '42.572')] [2024-08-05 18:59:50,090][15444] Updated weights for policy 0, policy_version 45921 (0.0019) [2024-08-05 18:59:53,124][15372] Fps is (10 sec: 24561.8, 60 sec: 24300.6, 300 sec: 24186.8). Total num frames: 376258560. Throughput: 0: 6077.4. Samples: 94057590. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 18:59:53,138][15372] Avg episode reward: [(0, '42.019')] [2024-08-05 18:59:53,563][15444] Updated weights for policy 0, policy_version 45931 (0.0035) [2024-08-05 18:59:56,994][15444] Updated weights for policy 0, policy_version 45941 (0.0022) [2024-08-05 18:59:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 376373248. Throughput: 0: 6064.7. Samples: 94093770. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 18:59:58,119][15372] Avg episode reward: [(0, '42.157')] [2024-08-05 19:00:00,283][15444] Updated weights for policy 0, policy_version 45951 (0.0021) [2024-08-05 19:00:03,118][15372] Fps is (10 sec: 23770.5, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 376496128. Throughput: 0: 6069.8. Samples: 94129680. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 19:00:03,126][15372] Avg episode reward: [(0, '41.813')] [2024-08-05 19:00:03,765][15444] Updated weights for policy 0, policy_version 45961 (0.0017) [2024-08-05 19:00:07,104][15444] Updated weights for policy 0, policy_version 45971 (0.0010) [2024-08-05 19:00:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 376610816. Throughput: 0: 6062.0. Samples: 94147960. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 19:00:08,119][15372] Avg episode reward: [(0, '42.074')] [2024-08-05 19:00:10,578][15444] Updated weights for policy 0, policy_version 45981 (0.0011) [2024-08-05 19:00:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 376733696. Throughput: 0: 6074.4. Samples: 94184310. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 19:00:13,119][15372] Avg episode reward: [(0, '42.427')] [2024-08-05 19:00:14,051][15444] Updated weights for policy 0, policy_version 45991 (0.0013) [2024-08-05 19:00:17,318][15444] Updated weights for policy 0, policy_version 46001 (0.0012) [2024-08-05 19:00:18,124][15372] Fps is (10 sec: 24562.2, 60 sec: 24300.9, 300 sec: 24186.8). Total num frames: 376856576. Throughput: 0: 6043.5. Samples: 94220130. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 19:00:18,132][15372] Avg episode reward: [(0, '43.772')] [2024-08-05 19:00:18,135][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000046003_376856576.pth... [2024-08-05 19:00:18,282][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000045295_371056640.pth [2024-08-05 19:00:20,877][15444] Updated weights for policy 0, policy_version 46011 (0.0011) [2024-08-05 19:00:21,441][15417] Signal inference workers to stop experience collection... 
(16950 times) [2024-08-05 19:00:21,446][15417] Signal inference workers to resume experience collection... (16950 times) [2024-08-05 19:00:21,501][15444] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-08-05 19:00:21,501][15444] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-08-05 19:00:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 376971264. Throughput: 0: 6050.9. Samples: 94238390. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:00:23,119][15372] Avg episode reward: [(0, '44.459')] [2024-08-05 19:00:23,918][15444] Updated weights for policy 0, policy_version 46021 (0.0012) [2024-08-05 19:00:27,473][15444] Updated weights for policy 0, policy_version 46031 (0.0012) [2024-08-05 19:00:28,118][15372] Fps is (10 sec: 24589.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 377102336. Throughput: 0: 6048.7. Samples: 94274790. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:00:28,119][15372] Avg episode reward: [(0, '42.175')] [2024-08-05 19:00:30,843][15444] Updated weights for policy 0, policy_version 46041 (0.0015) [2024-08-05 19:00:33,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 377217024. Throughput: 0: 6054.9. Samples: 94311530. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:00:33,126][15372] Avg episode reward: [(0, '41.956')] [2024-08-05 19:00:34,189][15444] Updated weights for policy 0, policy_version 46051 (0.0017) [2024-08-05 19:00:37,702][15444] Updated weights for policy 0, policy_version 46061 (0.0013) [2024-08-05 19:00:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 377339904. Throughput: 0: 6039.2. Samples: 94329320. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:00:38,119][15372] Avg episode reward: [(0, '43.032')] [2024-08-05 19:00:41,110][15444] Updated weights for policy 0, policy_version 46071 (0.0016) [2024-08-05 19:00:43,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 377462784. Throughput: 0: 6021.1. Samples: 94364720. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:00:43,126][15372] Avg episode reward: [(0, '42.334')] [2024-08-05 19:00:44,578][15444] Updated weights for policy 0, policy_version 46081 (0.0019) [2024-08-05 19:00:47,961][15444] Updated weights for policy 0, policy_version 46091 (0.0014) [2024-08-05 19:00:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 377577472. Throughput: 0: 6010.7. Samples: 94400160. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:00:48,119][15372] Avg episode reward: [(0, '43.581')] [2024-08-05 19:00:51,252][15444] Updated weights for policy 0, policy_version 46101 (0.0015) [2024-08-05 19:00:53,118][15372] Fps is (10 sec: 22938.0, 60 sec: 23895.6, 300 sec: 24131.7). Total num frames: 377692160. Throughput: 0: 6024.2. Samples: 94419050. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:00:53,126][15372] Avg episode reward: [(0, '43.333')] [2024-08-05 19:00:54,789][15444] Updated weights for policy 0, policy_version 46111 (0.0016) [2024-08-05 19:00:57,959][15444] Updated weights for policy 0, policy_version 46121 (0.0011) [2024-08-05 19:00:58,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 377823232. Throughput: 0: 6021.5. Samples: 94455280. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:00:58,119][15372] Avg episode reward: [(0, '43.507')] [2024-08-05 19:01:01,795][15444] Updated weights for policy 0, policy_version 46131 (0.0013) [2024-08-05 19:01:03,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24029.8, 300 sec: 24159.5). 
Total num frames: 377937920. Throughput: 0: 6018.7. Samples: 94490940. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:01:03,119][15372] Avg episode reward: [(0, '43.135')] [2024-08-05 19:01:04,719][15444] Updated weights for policy 0, policy_version 46141 (0.0013) [2024-08-05 19:01:06,625][15417] Signal inference workers to stop experience collection... (17000 times) [2024-08-05 19:01:06,626][15417] Signal inference workers to resume experience collection... (17000 times) [2024-08-05 19:01:06,658][15444] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-08-05 19:01:06,677][15444] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-08-05 19:01:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 378060800. Throughput: 0: 6040.0. Samples: 94510190. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:01:08,119][15372] Avg episode reward: [(0, '42.363')] [2024-08-05 19:01:08,199][15444] Updated weights for policy 0, policy_version 46151 (0.0028) [2024-08-05 19:01:11,918][15444] Updated weights for policy 0, policy_version 46161 (0.0040) [2024-08-05 19:01:13,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 378183680. Throughput: 0: 6033.8. Samples: 94546310. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:01:13,119][15372] Avg episode reward: [(0, '42.811')] [2024-08-05 19:01:14,921][15444] Updated weights for policy 0, policy_version 46171 (0.0024) [2024-08-05 19:01:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24032.1, 300 sec: 24131.7). Total num frames: 378298368. Throughput: 0: 6018.7. Samples: 94582370. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:01:18,126][15372] Avg episode reward: [(0, '42.457')] [2024-08-05 19:01:18,504][15444] Updated weights for policy 0, policy_version 46181 (0.0019) [2024-08-05 19:01:21,639][15444] Updated weights for policy 0, policy_version 46191 (0.0010) [2024-08-05 19:01:23,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 378429440. Throughput: 0: 6042.4. Samples: 94601230. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:01:23,126][15372] Avg episode reward: [(0, '42.753')] [2024-08-05 19:01:25,136][15444] Updated weights for policy 0, policy_version 46201 (0.0011) [2024-08-05 19:01:28,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 378552320. Throughput: 0: 6072.2. Samples: 94637970. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:01:28,126][15372] Avg episode reward: [(0, '43.839')] [2024-08-05 19:01:28,365][15444] Updated weights for policy 0, policy_version 46211 (0.0011) [2024-08-05 19:01:31,995][15444] Updated weights for policy 0, policy_version 46221 (0.0013) [2024-08-05 19:01:33,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 378675200. Throughput: 0: 6067.1. Samples: 94673180. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:01:33,119][15372] Avg episode reward: [(0, '43.890')] [2024-08-05 19:01:35,218][15444] Updated weights for policy 0, policy_version 46231 (0.0013) [2024-08-05 19:01:38,119][15372] Fps is (10 sec: 22937.4, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 378781696. Throughput: 0: 6065.3. Samples: 94691990. 
Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:01:38,127][15372] Avg episode reward: [(0, '43.984')] [2024-08-05 19:01:38,763][15444] Updated weights for policy 0, policy_version 46241 (0.0014) [2024-08-05 19:01:42,324][15444] Updated weights for policy 0, policy_version 46251 (0.0020) [2024-08-05 19:01:43,124][15372] Fps is (10 sec: 22924.2, 60 sec: 24027.5, 300 sec: 24159.0). Total num frames: 378904576. Throughput: 0: 6041.2. Samples: 94727170. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:01:43,125][15372] Avg episode reward: [(0, '44.161')] [2024-08-05 19:01:45,444][15444] Updated weights for policy 0, policy_version 46261 (0.0011) [2024-08-05 19:01:48,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 379027456. Throughput: 0: 6050.2. Samples: 94763200. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:01:48,126][15372] Avg episode reward: [(0, '43.504')] [2024-08-05 19:01:49,109][15444] Updated weights for policy 0, policy_version 46271 (0.0012) [2024-08-05 19:01:49,853][15417] Signal inference workers to stop experience collection... (17050 times) [2024-08-05 19:01:49,854][15417] Signal inference workers to resume experience collection... (17050 times) [2024-08-05 19:01:49,920][15444] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-08-05 19:01:49,920][15444] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-08-05 19:01:52,389][15444] Updated weights for policy 0, policy_version 46281 (0.0014) [2024-08-05 19:01:53,119][15372] Fps is (10 sec: 23770.7, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 379142144. Throughput: 0: 6028.6. Samples: 94781480. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:01:53,126][15372] Avg episode reward: [(0, '43.230')] [2024-08-05 19:01:55,676][15444] Updated weights for policy 0, policy_version 46291 (0.0012) [2024-08-05 19:01:58,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 379273216. Throughput: 0: 6034.7. Samples: 94817870. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:01:58,126][15372] Avg episode reward: [(0, '42.747')] [2024-08-05 19:01:59,222][15444] Updated weights for policy 0, policy_version 46301 (0.0017) [2024-08-05 19:02:02,305][15444] Updated weights for policy 0, policy_version 46311 (0.0020) [2024-08-05 19:02:03,118][15372] Fps is (10 sec: 25395.7, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 379396096. Throughput: 0: 6032.2. Samples: 94853820. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:02:03,126][15372] Avg episode reward: [(0, '43.443')] [2024-08-05 19:02:05,889][15444] Updated weights for policy 0, policy_version 46321 (0.0011) [2024-08-05 19:02:08,119][15372] Fps is (10 sec: 23755.4, 60 sec: 24166.1, 300 sec: 24187.2). Total num frames: 379510784. Throughput: 0: 6039.5. Samples: 94873010. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:02:08,119][15372] Avg episode reward: [(0, '43.059')] [2024-08-05 19:02:09,042][15444] Updated weights for policy 0, policy_version 46331 (0.0015) [2024-08-05 19:02:12,477][15444] Updated weights for policy 0, policy_version 46341 (0.0012) [2024-08-05 19:02:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 379633664. Throughput: 0: 6028.7. Samples: 94909260. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:02:13,119][15372] Avg episode reward: [(0, '42.297')] [2024-08-05 19:02:16,218][15444] Updated weights for policy 0, policy_version 46351 (0.0011) [2024-08-05 19:02:18,118][15372] Fps is (10 sec: 23758.2, 60 sec: 24166.4, 300 sec: 24159.5). 
Total num frames: 379748352. Throughput: 0: 6036.9. Samples: 94944840. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:02:18,119][15372] Avg episode reward: [(0, '43.020')] [2024-08-05 19:02:18,161][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000046357_379756544.pth... [2024-08-05 19:02:18,293][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000045649_373956608.pth [2024-08-05 19:02:19,283][15444] Updated weights for policy 0, policy_version 46361 (0.0014) [2024-08-05 19:02:22,925][15444] Updated weights for policy 0, policy_version 46371 (0.0023) [2024-08-05 19:02:23,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 379871232. Throughput: 0: 6027.6. Samples: 94963230. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:02:23,119][15372] Avg episode reward: [(0, '43.869')] [2024-08-05 19:02:26,009][15444] Updated weights for policy 0, policy_version 46381 (0.0012) [2024-08-05 19:02:28,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24029.7, 300 sec: 24187.2). Total num frames: 379994112. Throughput: 0: 6044.5. Samples: 94999140. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:02:28,119][15372] Avg episode reward: [(0, '44.162')] [2024-08-05 19:02:29,699][15444] Updated weights for policy 0, policy_version 46391 (0.0027) [2024-08-05 19:02:33,119][15372] Fps is (10 sec: 23756.2, 60 sec: 23893.3, 300 sec: 24159.4). Total num frames: 380108800. Throughput: 0: 6027.3. Samples: 95034430. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:02:33,126][15372] Avg episode reward: [(0, '43.403')] [2024-08-05 19:02:33,351][15444] Updated weights for policy 0, policy_version 46401 (0.0030) [2024-08-05 19:02:35,775][15417] Signal inference workers to stop experience collection... 
(17100 times) [2024-08-05 19:02:35,776][15417] Signal inference workers to resume experience collection... (17100 times) [2024-08-05 19:02:35,819][15444] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-08-05 19:02:35,820][15444] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-08-05 19:02:36,583][15444] Updated weights for policy 0, policy_version 46411 (0.0020) [2024-08-05 19:02:38,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 380239872. Throughput: 0: 6028.2. Samples: 95052750. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:02:38,119][15372] Avg episode reward: [(0, '43.060')] [2024-08-05 19:02:39,758][15444] Updated weights for policy 0, policy_version 46421 (0.0019) [2024-08-05 19:02:43,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24168.8, 300 sec: 24159.5). Total num frames: 380354560. Throughput: 0: 6056.9. Samples: 95090430. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:02:43,119][15372] Avg episode reward: [(0, '43.120')] [2024-08-05 19:02:43,226][15444] Updated weights for policy 0, policy_version 46431 (0.0022) [2024-08-05 19:02:46,241][15444] Updated weights for policy 0, policy_version 46441 (0.0025) [2024-08-05 19:02:48,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 380485632. Throughput: 0: 6062.0. Samples: 95126610. Policy #0 lag: (min: 0.0, avg: 4.5, max: 10.0) [2024-08-05 19:02:48,126][15372] Avg episode reward: [(0, '42.958')] [2024-08-05 19:02:49,834][15444] Updated weights for policy 0, policy_version 46451 (0.0026) [2024-08-05 19:02:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24160.2). Total num frames: 380600320. Throughput: 0: 6046.7. Samples: 95145110. 
Policy #0 lag: (min: 0.0, avg: 4.5, max: 10.0) [2024-08-05 19:02:53,126][15372] Avg episode reward: [(0, '42.263')] [2024-08-05 19:02:53,405][15444] Updated weights for policy 0, policy_version 46461 (0.0011) [2024-08-05 19:02:56,369][15444] Updated weights for policy 0, policy_version 46471 (0.0024) [2024-08-05 19:02:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 380723200. Throughput: 0: 6037.1. Samples: 95180930. Policy #0 lag: (min: 0.0, avg: 4.5, max: 10.0) [2024-08-05 19:02:58,126][15372] Avg episode reward: [(0, '43.165')] [2024-08-05 19:03:00,040][15444] Updated weights for policy 0, policy_version 46481 (0.0021) [2024-08-05 19:03:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 380846080. Throughput: 0: 6059.8. Samples: 95217530. Policy #0 lag: (min: 0.0, avg: 4.5, max: 10.0) [2024-08-05 19:03:03,126][15372] Avg episode reward: [(0, '43.455')] [2024-08-05 19:03:03,293][15444] Updated weights for policy 0, policy_version 46491 (0.0013) [2024-08-05 19:03:06,715][15444] Updated weights for policy 0, policy_version 46501 (0.0013) [2024-08-05 19:03:08,119][15372] Fps is (10 sec: 24573.7, 60 sec: 24302.8, 300 sec: 24187.1). Total num frames: 380968960. Throughput: 0: 6061.4. Samples: 95236000. Policy #0 lag: (min: 0.0, avg: 4.5, max: 10.0) [2024-08-05 19:03:08,120][15372] Avg episode reward: [(0, '42.924')] [2024-08-05 19:03:10,095][15444] Updated weights for policy 0, policy_version 46511 (0.0012) [2024-08-05 19:03:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 381083648. Throughput: 0: 6072.3. Samples: 95272390. Policy #0 lag: (min: 0.0, avg: 4.5, max: 10.0) [2024-08-05 19:03:13,126][15372] Avg episode reward: [(0, '42.181')] [2024-08-05 19:03:13,540][15444] Updated weights for policy 0, policy_version 46521 (0.0015) [2024-08-05 19:03:15,305][15417] Signal inference workers to stop experience collection... 
(17150 times) [2024-08-05 19:03:15,306][15417] Signal inference workers to resume experience collection... (17150 times) [2024-08-05 19:03:15,359][15444] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-08-05 19:03:15,364][15444] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-08-05 19:03:16,926][15444] Updated weights for policy 0, policy_version 46531 (0.0017) [2024-08-05 19:03:18,118][15372] Fps is (10 sec: 24578.4, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 381214720. Throughput: 0: 6110.5. Samples: 95309400. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 19:03:18,119][15372] Avg episode reward: [(0, '42.982')] [2024-08-05 19:03:20,110][15444] Updated weights for policy 0, policy_version 46541 (0.0012) [2024-08-05 19:03:23,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24439.5, 300 sec: 24215.1). Total num frames: 381337600. Throughput: 0: 6109.8. Samples: 95327690. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 19:03:23,126][15372] Avg episode reward: [(0, '43.262')] [2024-08-05 19:03:23,476][15444] Updated weights for policy 0, policy_version 46551 (0.0024) [2024-08-05 19:03:26,871][15444] Updated weights for policy 0, policy_version 46561 (0.0012) [2024-08-05 19:03:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.6, 300 sec: 24215.0). Total num frames: 381460480. Throughput: 0: 6080.0. Samples: 95364030. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 19:03:28,119][15372] Avg episode reward: [(0, '42.391')] [2024-08-05 19:03:30,128][15444] Updated weights for policy 0, policy_version 46571 (0.0023) [2024-08-05 19:03:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24439.6, 300 sec: 24187.2). Total num frames: 381575168. Throughput: 0: 6093.8. Samples: 95400830. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 19:03:33,126][15372] Avg episode reward: [(0, '43.034')] [2024-08-05 19:03:33,655][15444] Updated weights for policy 0, policy_version 46581 (0.0013) [2024-08-05 19:03:36,991][15444] Updated weights for policy 0, policy_version 46591 (0.0017) [2024-08-05 19:03:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 381698048. Throughput: 0: 6082.2. Samples: 95418810. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 19:03:38,119][15372] Avg episode reward: [(0, '42.910')] [2024-08-05 19:03:40,233][15444] Updated weights for policy 0, policy_version 46601 (0.0021) [2024-08-05 19:03:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 381820928. Throughput: 0: 6101.6. Samples: 95455500. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 19:03:43,126][15372] Avg episode reward: [(0, '42.733')] [2024-08-05 19:03:43,770][15444] Updated weights for policy 0, policy_version 46611 (0.0024) [2024-08-05 19:03:47,019][15444] Updated weights for policy 0, policy_version 46621 (0.0011) [2024-08-05 19:03:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 381943808. Throughput: 0: 6097.8. Samples: 95491930. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:03:48,119][15372] Avg episode reward: [(0, '41.680')] [2024-08-05 19:03:50,409][15444] Updated weights for policy 0, policy_version 46631 (0.0016) [2024-08-05 19:03:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24242.8). Total num frames: 382066688. Throughput: 0: 6091.7. Samples: 95510120. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:03:53,126][15372] Avg episode reward: [(0, '42.119')] [2024-08-05 19:03:53,862][15444] Updated weights for policy 0, policy_version 46641 (0.0012) [2024-08-05 19:03:57,237][15444] Updated weights for policy 0, policy_version 46651 (0.0021) [2024-08-05 19:03:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 382181376. Throughput: 0: 6071.1. Samples: 95545590. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:03:58,119][15372] Avg episode reward: [(0, '43.280')] [2024-08-05 19:04:00,681][15444] Updated weights for policy 0, policy_version 46661 (0.0020) [2024-08-05 19:04:03,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24302.8, 300 sec: 24215.0). Total num frames: 382304256. Throughput: 0: 6055.9. Samples: 95581920. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:04:03,127][15372] Avg episode reward: [(0, '42.937')] [2024-08-05 19:04:04,253][15444] Updated weights for policy 0, policy_version 46671 (0.0012) [2024-08-05 19:04:07,395][15444] Updated weights for policy 0, policy_version 46681 (0.0039) [2024-08-05 19:04:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.8, 300 sec: 24187.2). Total num frames: 382418944. Throughput: 0: 6045.6. Samples: 95599740. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:04:08,119][15372] Avg episode reward: [(0, '43.155')] [2024-08-05 19:04:11,007][15444] Updated weights for policy 0, policy_version 46691 (0.0016) [2024-08-05 19:04:13,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 382541824. Throughput: 0: 6027.8. Samples: 95635280. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:04:13,119][15372] Avg episode reward: [(0, '43.033')] [2024-08-05 19:04:14,541][15444] Updated weights for policy 0, policy_version 46701 (0.0012) [2024-08-05 19:04:17,691][15444] Updated weights for policy 0, policy_version 46711 (0.0025) [2024-08-05 19:04:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 382664704. Throughput: 0: 6012.2. Samples: 95671380. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:04:18,119][15372] Avg episode reward: [(0, '42.542')] [2024-08-05 19:04:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000046712_382664704.pth... [2024-08-05 19:04:18,227][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000046003_376856576.pth [2024-08-05 19:04:21,206][15444] Updated weights for policy 0, policy_version 46721 (0.0012) [2024-08-05 19:04:22,332][15417] Signal inference workers to stop experience collection... (17200 times) [2024-08-05 19:04:22,344][15417] Signal inference workers to resume experience collection... (17200 times) [2024-08-05 19:04:22,409][15444] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-08-05 19:04:22,410][15444] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-08-05 19:04:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 382779392. Throughput: 0: 6018.4. Samples: 95689640. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:04:23,119][15372] Avg episode reward: [(0, '42.715')] [2024-08-05 19:04:24,490][15444] Updated weights for policy 0, policy_version 46731 (0.0019) [2024-08-05 19:04:28,107][15444] Updated weights for policy 0, policy_version 46741 (0.0017) [2024-08-05 19:04:28,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 382902272. Throughput: 0: 6008.0. Samples: 95725860. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:04:28,119][15372] Avg episode reward: [(0, '42.700')] [2024-08-05 19:04:31,334][15444] Updated weights for policy 0, policy_version 46751 (0.0026) [2024-08-05 19:04:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 383025152. Throughput: 0: 5998.4. Samples: 95761860. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:04:33,126][15372] Avg episode reward: [(0, '41.914')] [2024-08-05 19:04:34,686][15444] Updated weights for policy 0, policy_version 46761 (0.0014) [2024-08-05 19:04:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 383139840. Throughput: 0: 6014.2. Samples: 95780760. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:04:38,126][15372] Avg episode reward: [(0, '42.610')] [2024-08-05 19:04:38,282][15444] Updated weights for policy 0, policy_version 46771 (0.0021) [2024-08-05 19:04:41,320][15444] Updated weights for policy 0, policy_version 46781 (0.0034) [2024-08-05 19:04:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 383262720. Throughput: 0: 6018.2. Samples: 95816410. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:04:43,126][15372] Avg episode reward: [(0, '42.240')] [2024-08-05 19:04:45,003][15444] Updated weights for policy 0, policy_version 46791 (0.0012) [2024-08-05 19:04:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24159.9). 
Total num frames: 383385600. Throughput: 0: 6021.8. Samples: 95852900. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:04:48,126][15372] Avg episode reward: [(0, '43.236')] [2024-08-05 19:04:48,367][15444] Updated weights for policy 0, policy_version 46801 (0.0013) [2024-08-05 19:04:51,685][15444] Updated weights for policy 0, policy_version 46811 (0.0015) [2024-08-05 19:04:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 383508480. Throughput: 0: 6035.6. Samples: 95871340. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:04:53,126][15372] Avg episode reward: [(0, '42.483')] [2024-08-05 19:04:54,894][15444] Updated weights for policy 0, policy_version 46821 (0.0012) [2024-08-05 19:04:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 383631360. Throughput: 0: 6060.9. Samples: 95908020. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:04:58,126][15372] Avg episode reward: [(0, '42.102')] [2024-08-05 19:04:58,347][15444] Updated weights for policy 0, policy_version 46831 (0.0015) [2024-08-05 19:05:01,869][15444] Updated weights for policy 0, policy_version 46841 (0.0033) [2024-08-05 19:05:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.6, 300 sec: 24215.0). Total num frames: 383754240. Throughput: 0: 6050.2. Samples: 95943640. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:05:03,119][15372] Avg episode reward: [(0, '42.976')] [2024-08-05 19:05:04,929][15444] Updated weights for policy 0, policy_version 46851 (0.0029) [2024-08-05 19:05:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 383868928. Throughput: 0: 6070.7. Samples: 95962820. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:05:08,126][15372] Avg episode reward: [(0, '43.019')] [2024-08-05 19:05:08,476][15444] Updated weights for policy 0, policy_version 46861 (0.0024) [2024-08-05 19:05:12,033][15444] Updated weights for policy 0, policy_version 46871 (0.0012) [2024-08-05 19:05:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.7). Total num frames: 383991808. Throughput: 0: 6061.8. Samples: 95998640. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:05:13,119][15372] Avg episode reward: [(0, '43.681')] [2024-08-05 19:05:15,118][15444] Updated weights for policy 0, policy_version 46881 (0.0014) [2024-08-05 19:05:17,305][15417] Signal inference workers to stop experience collection... (17250 times) [2024-08-05 19:05:17,305][15417] Signal inference workers to resume experience collection... (17250 times) [2024-08-05 19:05:17,342][15444] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-08-05 19:05:17,342][15444] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-08-05 19:05:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 384114688. Throughput: 0: 6069.3. Samples: 96034980. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:05:18,126][15372] Avg episode reward: [(0, '44.294')] [2024-08-05 19:05:18,838][15444] Updated weights for policy 0, policy_version 46891 (0.0019) [2024-08-05 19:05:21,867][15444] Updated weights for policy 0, policy_version 46901 (0.0022) [2024-08-05 19:05:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 384229376. Throughput: 0: 6063.6. Samples: 96053620. 
Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:05:23,119][15372] Avg episode reward: [(0, '43.010')] [2024-08-05 19:05:25,435][15444] Updated weights for policy 0, policy_version 46911 (0.0013) [2024-08-05 19:05:28,120][15372] Fps is (10 sec: 24572.4, 60 sec: 24302.4, 300 sec: 24214.9). Total num frames: 384360448. Throughput: 0: 6077.4. Samples: 96089900. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:05:28,120][15372] Avg episode reward: [(0, '43.474')] [2024-08-05 19:05:28,981][15444] Updated weights for policy 0, policy_version 46921 (0.0041) [2024-08-05 19:05:32,007][15444] Updated weights for policy 0, policy_version 46931 (0.0024) [2024-08-05 19:05:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 384475136. Throughput: 0: 6059.1. Samples: 96125560. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:05:33,126][15372] Avg episode reward: [(0, '43.204')] [2024-08-05 19:05:35,640][15444] Updated weights for policy 0, policy_version 46941 (0.0023) [2024-08-05 19:05:38,119][15372] Fps is (10 sec: 23759.7, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 384598016. Throughput: 0: 6054.2. Samples: 96143780. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:05:38,127][15372] Avg episode reward: [(0, '43.309')] [2024-08-05 19:05:38,988][15444] Updated weights for policy 0, policy_version 46951 (0.0013) [2024-08-05 19:05:42,396][15444] Updated weights for policy 0, policy_version 46961 (0.0034) [2024-08-05 19:05:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 384720896. Throughput: 0: 6044.0. Samples: 96180000. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:05:43,126][15372] Avg episode reward: [(0, '42.604')] [2024-08-05 19:05:45,562][15444] Updated weights for policy 0, policy_version 46971 (0.0018) [2024-08-05 19:05:48,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.3, 300 sec: 24215.0). 
Total num frames: 384835584. Throughput: 0: 6047.7. Samples: 96215790. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:05:48,148][15372] Avg episode reward: [(0, '42.329')] [2024-08-05 19:05:49,324][15444] Updated weights for policy 0, policy_version 46981 (0.0024) [2024-08-05 19:05:52,768][15444] Updated weights for policy 0, policy_version 46991 (0.0032) [2024-08-05 19:05:53,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 384950272. Throughput: 0: 6027.3. Samples: 96234050. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:05:53,119][15372] Avg episode reward: [(0, '43.032')] [2024-08-05 19:05:55,979][15444] Updated weights for policy 0, policy_version 47001 (0.0017) [2024-08-05 19:05:58,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24166.2, 300 sec: 24215.0). Total num frames: 385081344. Throughput: 0: 6027.7. Samples: 96269890. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:05:58,120][15372] Avg episode reward: [(0, '43.518')] [2024-08-05 19:05:59,722][15444] Updated weights for policy 0, policy_version 47011 (0.0011) [2024-08-05 19:06:02,596][15417] Signal inference workers to stop experience collection... (17300 times) [2024-08-05 19:06:02,604][15417] Signal inference workers to resume experience collection... (17300 times) [2024-08-05 19:06:02,667][15444] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-08-05 19:06:02,667][15444] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-08-05 19:06:02,695][15444] Updated weights for policy 0, policy_version 47021 (0.0012) [2024-08-05 19:06:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 385196032. Throughput: 0: 6000.7. Samples: 96305010. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:06:03,119][15372] Avg episode reward: [(0, '43.271')] [2024-08-05 19:06:06,400][15444] Updated weights for policy 0, policy_version 47031 (0.0031) [2024-08-05 19:06:08,118][15372] Fps is (10 sec: 22939.0, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 385310720. Throughput: 0: 6006.4. Samples: 96323910. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:06:08,119][15372] Avg episode reward: [(0, '42.577')] [2024-08-05 19:06:09,892][15444] Updated weights for policy 0, policy_version 47041 (0.0015) [2024-08-05 19:06:13,123][15372] Fps is (10 sec: 23746.1, 60 sec: 24028.1, 300 sec: 24186.9). Total num frames: 385433600. Throughput: 0: 6010.3. Samples: 96360380. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 19:06:13,131][15372] Avg episode reward: [(0, '42.923')] [2024-08-05 19:06:13,134][15444] Updated weights for policy 0, policy_version 47051 (0.0011) [2024-08-05 19:06:16,431][15444] Updated weights for policy 0, policy_version 47061 (0.0040) [2024-08-05 19:06:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 385556480. Throughput: 0: 6013.1. Samples: 96396150. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 19:06:18,126][15372] Avg episode reward: [(0, '43.114')] [2024-08-05 19:06:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000047065_385556480.pth... [2024-08-05 19:06:18,279][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000046357_379756544.pth [2024-08-05 19:06:19,680][15444] Updated weights for policy 0, policy_version 47071 (0.0012) [2024-08-05 19:06:23,118][15372] Fps is (10 sec: 24587.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 385679360. Throughput: 0: 6017.6. Samples: 96414570. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 19:06:23,126][15372] Avg episode reward: [(0, '42.203')] [2024-08-05 19:06:23,399][15444] Updated weights for policy 0, policy_version 47081 (0.0012) [2024-08-05 19:06:26,716][15444] Updated weights for policy 0, policy_version 47091 (0.0014) [2024-08-05 19:06:28,119][15372] Fps is (10 sec: 24574.2, 60 sec: 24030.2, 300 sec: 24159.4). Total num frames: 385802240. Throughput: 0: 6012.6. Samples: 96450570. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 19:06:28,120][15372] Avg episode reward: [(0, '42.356')] [2024-08-05 19:06:29,935][15444] Updated weights for policy 0, policy_version 47101 (0.0010) [2024-08-05 19:06:33,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 385916928. Throughput: 0: 6020.9. Samples: 96486730. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 19:06:33,126][15372] Avg episode reward: [(0, '42.659')] [2024-08-05 19:06:33,707][15444] Updated weights for policy 0, policy_version 47111 (0.0032) [2024-08-05 19:06:36,651][15444] Updated weights for policy 0, policy_version 47121 (0.0022) [2024-08-05 19:06:38,118][15372] Fps is (10 sec: 23758.6, 60 sec: 24030.0, 300 sec: 24187.7). Total num frames: 386039808. Throughput: 0: 6033.3. Samples: 96505550. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 19:06:38,126][15372] Avg episode reward: [(0, '42.588')] [2024-08-05 19:06:38,383][15417] Signal inference workers to stop experience collection... (17350 times) [2024-08-05 19:06:38,384][15417] Signal inference workers to resume experience collection... 
(17350 times) [2024-08-05 19:06:38,423][15444] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-08-05 19:06:38,424][15444] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-08-05 19:06:40,216][15444] Updated weights for policy 0, policy_version 47131 (0.0013) [2024-08-05 19:06:43,119][15372] Fps is (10 sec: 25395.3, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 386170880. Throughput: 0: 6040.3. Samples: 96541700. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 19:06:43,126][15372] Avg episode reward: [(0, '42.515')] [2024-08-05 19:06:43,698][15444] Updated weights for policy 0, policy_version 47141 (0.0018) [2024-08-05 19:06:46,905][15444] Updated weights for policy 0, policy_version 47151 (0.0011) [2024-08-05 19:06:48,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 386285568. Throughput: 0: 6052.9. Samples: 96577390. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 19:06:48,119][15372] Avg episode reward: [(0, '43.352')] [2024-08-05 19:06:50,478][15444] Updated weights for policy 0, policy_version 47161 (0.0011) [2024-08-05 19:06:53,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 386408448. Throughput: 0: 6054.0. Samples: 96596340. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 19:06:53,120][15372] Avg episode reward: [(0, '43.541')] [2024-08-05 19:06:53,583][15444] Updated weights for policy 0, policy_version 47171 (0.0018) [2024-08-05 19:06:57,285][15444] Updated weights for policy 0, policy_version 47181 (0.0018) [2024-08-05 19:06:58,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 386531328. Throughput: 0: 6031.5. Samples: 96631770. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 19:06:58,119][15372] Avg episode reward: [(0, '43.860')] [2024-08-05 19:07:00,393][15444] Updated weights for policy 0, policy_version 47191 (0.0012) [2024-08-05 19:07:03,118][15372] Fps is (10 sec: 23757.6, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 386646016. Throughput: 0: 6066.7. Samples: 96669150. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 19:07:03,119][15372] Avg episode reward: [(0, '43.356')] [2024-08-05 19:07:03,843][15444] Updated weights for policy 0, policy_version 47201 (0.0027) [2024-08-05 19:07:07,563][15444] Updated weights for policy 0, policy_version 47211 (0.0021) [2024-08-05 19:07:08,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 386768896. Throughput: 0: 6055.8. Samples: 96687080. Policy #0 lag: (min: 0.0, avg: 3.3, max: 8.0) [2024-08-05 19:07:08,119][15372] Avg episode reward: [(0, '42.447')] [2024-08-05 19:07:10,622][15444] Updated weights for policy 0, policy_version 47221 (0.0028) [2024-08-05 19:07:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24304.7, 300 sec: 24215.0). Total num frames: 386891776. Throughput: 0: 6047.7. Samples: 96722710. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:07:13,126][15372] Avg episode reward: [(0, '42.239')] [2024-08-05 19:07:14,189][15444] Updated weights for policy 0, policy_version 47231 (0.0024) [2024-08-05 19:07:16,389][15417] Signal inference workers to stop experience collection... (17400 times) [2024-08-05 19:07:16,389][15417] Signal inference workers to resume experience collection... 
(17400 times) [2024-08-05 19:07:16,429][15444] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-08-05 19:07:16,430][15444] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-08-05 19:07:17,336][15444] Updated weights for policy 0, policy_version 47241 (0.0017) [2024-08-05 19:07:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 387006464. Throughput: 0: 6030.2. Samples: 96758090. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:07:18,119][15372] Avg episode reward: [(0, '42.693')] [2024-08-05 19:07:21,011][15444] Updated weights for policy 0, policy_version 47251 (0.0013) [2024-08-05 19:07:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 387129344. Throughput: 0: 6026.9. Samples: 96776760. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:07:23,119][15372] Avg episode reward: [(0, '43.046')] [2024-08-05 19:07:24,547][15444] Updated weights for policy 0, policy_version 47261 (0.0018) [2024-08-05 19:07:27,722][15444] Updated weights for policy 0, policy_version 47271 (0.0021) [2024-08-05 19:07:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.7, 300 sec: 24215.0). Total num frames: 387252224. Throughput: 0: 6018.2. Samples: 96812520. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:07:28,119][15372] Avg episode reward: [(0, '42.439')] [2024-08-05 19:07:31,482][15444] Updated weights for policy 0, policy_version 47281 (0.0021) [2024-08-05 19:07:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 387366912. Throughput: 0: 5997.8. Samples: 96847290. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:07:33,119][15372] Avg episode reward: [(0, '42.443')] [2024-08-05 19:07:34,623][15444] Updated weights for policy 0, policy_version 47291 (0.0021) [2024-08-05 19:07:37,989][15444] Updated weights for policy 0, policy_version 47301 (0.0012) [2024-08-05 19:07:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 387489792. Throughput: 0: 6002.3. Samples: 96866440. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:07:38,119][15372] Avg episode reward: [(0, '41.874')] [2024-08-05 19:07:41,576][15444] Updated weights for policy 0, policy_version 47311 (0.0011) [2024-08-05 19:07:43,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 387612672. Throughput: 0: 6026.2. Samples: 96902950. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:07:43,119][15372] Avg episode reward: [(0, '41.701')] [2024-08-05 19:07:44,749][15444] Updated weights for policy 0, policy_version 47321 (0.0011) [2024-08-05 19:07:48,051][15444] Updated weights for policy 0, policy_version 47331 (0.0012) [2024-08-05 19:07:48,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 387735552. Throughput: 0: 6000.7. Samples: 96939180. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:07:48,119][15372] Avg episode reward: [(0, '42.232')] [2024-08-05 19:07:51,383][15444] Updated weights for policy 0, policy_version 47341 (0.0015) [2024-08-05 19:07:53,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.4). Total num frames: 387850240. Throughput: 0: 6012.0. Samples: 96957620. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:07:53,119][15372] Avg episode reward: [(0, '42.996')] [2024-08-05 19:07:54,983][15444] Updated weights for policy 0, policy_version 47351 (0.0025) [2024-08-05 19:07:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.0, 300 sec: 24159.5). 
Total num frames: 387973120. Throughput: 0: 6028.2. Samples: 96993980. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:07:58,126][15372] Avg episode reward: [(0, '43.015')] [2024-08-05 19:07:58,261][15444] Updated weights for policy 0, policy_version 47361 (0.0013) [2024-08-05 19:08:01,798][15444] Updated weights for policy 0, policy_version 47371 (0.0021) [2024-08-05 19:08:03,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 388096000. Throughput: 0: 6033.8. Samples: 97029610. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:08:03,119][15372] Avg episode reward: [(0, '42.432')] [2024-08-05 19:08:04,886][15444] Updated weights for policy 0, policy_version 47381 (0.0012) [2024-08-05 19:08:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 388210688. Throughput: 0: 6044.4. Samples: 97048760. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:08:08,126][15372] Avg episode reward: [(0, '42.964')] [2024-08-05 19:08:08,481][15444] Updated weights for policy 0, policy_version 47391 (0.0012) [2024-08-05 19:08:08,947][15417] Signal inference workers to stop experience collection... (17450 times) [2024-08-05 19:08:08,948][15417] Signal inference workers to resume experience collection... (17450 times) [2024-08-05 19:08:09,005][15444] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-08-05 19:08:09,005][15444] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-08-05 19:08:11,515][15444] Updated weights for policy 0, policy_version 47401 (0.0027) [2024-08-05 19:08:13,122][15372] Fps is (10 sec: 24566.4, 60 sec: 24164.8, 300 sec: 24159.1). Total num frames: 388341760. Throughput: 0: 6052.4. Samples: 97084900. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:08:13,123][15372] Avg episode reward: [(0, '43.521')] [2024-08-05 19:08:15,001][15444] Updated weights for policy 0, policy_version 47411 (0.0010) [2024-08-05 19:08:18,119][15372] Fps is (10 sec: 25394.8, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 388464640. Throughput: 0: 6116.9. Samples: 97122550. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:08:18,119][15372] Avg episode reward: [(0, '43.880')] [2024-08-05 19:08:18,124][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000047420_388464640.pth... [2024-08-05 19:08:18,275][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000046712_382664704.pth [2024-08-05 19:08:18,396][15444] Updated weights for policy 0, policy_version 47421 (0.0023) [2024-08-05 19:08:21,721][15444] Updated weights for policy 0, policy_version 47431 (0.0011) [2024-08-05 19:08:23,118][15372] Fps is (10 sec: 23766.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 388579328. Throughput: 0: 6093.3. Samples: 97140640. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:08:23,119][15372] Avg episode reward: [(0, '43.715')] [2024-08-05 19:08:25,186][15444] Updated weights for policy 0, policy_version 47441 (0.0024) [2024-08-05 19:08:28,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 388710400. Throughput: 0: 6082.2. Samples: 97176650. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:08:28,126][15372] Avg episode reward: [(0, '43.296')] [2024-08-05 19:08:28,634][15444] Updated weights for policy 0, policy_version 47451 (0.0016) [2024-08-05 19:08:31,863][15444] Updated weights for policy 0, policy_version 47461 (0.0022) [2024-08-05 19:08:33,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 388825088. 
Throughput: 0: 6067.3. Samples: 97212210. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:08:33,119][15372] Avg episode reward: [(0, '42.739')] [2024-08-05 19:08:35,368][15444] Updated weights for policy 0, policy_version 47471 (0.0012) [2024-08-05 19:08:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 388947968. Throughput: 0: 6066.0. Samples: 97230590. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:08:38,119][15372] Avg episode reward: [(0, '42.391')] [2024-08-05 19:08:38,876][15444] Updated weights for policy 0, policy_version 47481 (0.0025) [2024-08-05 19:08:42,376][15444] Updated weights for policy 0, policy_version 47491 (0.0019) [2024-08-05 19:08:43,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 389062656. Throughput: 0: 6050.9. Samples: 97266270. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 19:08:43,120][15372] Avg episode reward: [(0, '42.173')] [2024-08-05 19:08:45,402][15444] Updated weights for policy 0, policy_version 47501 (0.0032) [2024-08-05 19:08:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 389185536. Throughput: 0: 6057.6. Samples: 97302200. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 19:08:48,126][15372] Avg episode reward: [(0, '42.099')] [2024-08-05 19:08:49,148][15444] Updated weights for policy 0, policy_version 47511 (0.0024) [2024-08-05 19:08:52,480][15444] Updated weights for policy 0, policy_version 47521 (0.0019) [2024-08-05 19:08:53,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 389308416. Throughput: 0: 6041.5. Samples: 97320630. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 19:08:53,119][15372] Avg episode reward: [(0, '43.304')] [2024-08-05 19:08:55,803][15444] Updated weights for policy 0, policy_version 47531 (0.0020) [2024-08-05 19:08:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 389423104. Throughput: 0: 6032.3. Samples: 97356330. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 19:08:58,120][15372] Avg episode reward: [(0, '43.606')] [2024-08-05 19:08:59,496][15444] Updated weights for policy 0, policy_version 47541 (0.0028) [2024-08-05 19:09:01,595][15417] Signal inference workers to stop experience collection... (17500 times) [2024-08-05 19:09:01,595][15417] Signal inference workers to resume experience collection... (17500 times) [2024-08-05 19:09:01,653][15444] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-08-05 19:09:01,653][15444] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-08-05 19:09:02,866][15444] Updated weights for policy 0, policy_version 47551 (0.0018) [2024-08-05 19:09:03,118][15372] Fps is (10 sec: 22938.3, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 389537792. Throughput: 0: 5982.2. Samples: 97391750. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 19:09:03,119][15372] Avg episode reward: [(0, '42.946')] [2024-08-05 19:09:06,077][15444] Updated weights for policy 0, policy_version 47561 (0.0012) [2024-08-05 19:09:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 389660672. Throughput: 0: 5986.7. Samples: 97410040. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 19:09:08,119][15372] Avg episode reward: [(0, '43.120')] [2024-08-05 19:09:09,848][15444] Updated weights for policy 0, policy_version 47571 (0.0014) [2024-08-05 19:09:12,829][15444] Updated weights for policy 0, policy_version 47581 (0.0031) [2024-08-05 19:09:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24031.4, 300 sec: 24131.7). Total num frames: 389783552. Throughput: 0: 5990.9. Samples: 97446240. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:09:13,119][15372] Avg episode reward: [(0, '43.002')] [2024-08-05 19:09:16,394][15444] Updated weights for policy 0, policy_version 47591 (0.0012) [2024-08-05 19:09:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 389898240. Throughput: 0: 6002.2. Samples: 97482310. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:09:18,119][15372] Avg episode reward: [(0, '43.227')] [2024-08-05 19:09:19,780][15444] Updated weights for policy 0, policy_version 47601 (0.0025) [2024-08-05 19:09:23,058][15444] Updated weights for policy 0, policy_version 47611 (0.0022) [2024-08-05 19:09:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 390029312. Throughput: 0: 6012.0. Samples: 97501130. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:09:23,119][15372] Avg episode reward: [(0, '43.492')] [2024-08-05 19:09:26,512][15444] Updated weights for policy 0, policy_version 47621 (0.0012) [2024-08-05 19:09:28,118][15372] Fps is (10 sec: 24576.1, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 390144000. Throughput: 0: 6017.6. Samples: 97537060. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:09:28,126][15372] Avg episode reward: [(0, '42.102')] [2024-08-05 19:09:29,674][15444] Updated weights for policy 0, policy_version 47631 (0.0012) [2024-08-05 19:09:33,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.9, 300 sec: 24159.5). 
Total num frames: 390266880. Throughput: 0: 6036.2. Samples: 97573830. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:09:33,126][15372] Avg episode reward: [(0, '43.227')] [2024-08-05 19:09:33,185][15444] Updated weights for policy 0, policy_version 47641 (0.0020) [2024-08-05 19:09:36,616][15444] Updated weights for policy 0, policy_version 47651 (0.0024) [2024-08-05 19:09:38,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 390389760. Throughput: 0: 6043.8. Samples: 97592600. Policy #0 lag: (min: 1.0, avg: 4.5, max: 8.0) [2024-08-05 19:09:38,126][15372] Avg episode reward: [(0, '42.883')] [2024-08-05 19:09:40,067][15444] Updated weights for policy 0, policy_version 47661 (0.0013) [2024-08-05 19:09:43,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 390512640. Throughput: 0: 6048.4. Samples: 97628510. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:09:43,126][15372] Avg episode reward: [(0, '43.171')] [2024-08-05 19:09:43,283][15444] Updated weights for policy 0, policy_version 47671 (0.0040) [2024-08-05 19:09:46,753][15444] Updated weights for policy 0, policy_version 47681 (0.0023) [2024-08-05 19:09:48,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 390635520. Throughput: 0: 6049.8. Samples: 97663990. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:09:48,119][15372] Avg episode reward: [(0, '43.389')] [2024-08-05 19:09:49,965][15444] Updated weights for policy 0, policy_version 47691 (0.0019) [2024-08-05 19:09:50,775][15417] Signal inference workers to stop experience collection... (17550 times) [2024-08-05 19:09:50,780][15417] Signal inference workers to resume experience collection... 
(17550 times) [2024-08-05 19:09:50,832][15444] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-08-05 19:09:50,833][15444] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-08-05 19:09:53,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 390758400. Throughput: 0: 6059.5. Samples: 97682720. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:09:53,119][15372] Avg episode reward: [(0, '43.346')] [2024-08-05 19:09:53,414][15444] Updated weights for policy 0, policy_version 47701 (0.0024) [2024-08-05 19:09:56,978][15444] Updated weights for policy 0, policy_version 47711 (0.0019) [2024-08-05 19:09:58,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 390873088. Throughput: 0: 6070.2. Samples: 97719400. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:09:58,119][15372] Avg episode reward: [(0, '43.152')] [2024-08-05 19:10:00,139][15444] Updated weights for policy 0, policy_version 47721 (0.0018) [2024-08-05 19:10:03,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 390995968. Throughput: 0: 6087.3. Samples: 97756240. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:10:03,126][15372] Avg episode reward: [(0, '42.677')] [2024-08-05 19:10:03,559][15444] Updated weights for policy 0, policy_version 47731 (0.0012) [2024-08-05 19:10:06,977][15444] Updated weights for policy 0, policy_version 47741 (0.0019) [2024-08-05 19:10:08,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 391118848. Throughput: 0: 6076.0. Samples: 97774550. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:10:08,119][15372] Avg episode reward: [(0, '43.489')] [2024-08-05 19:10:10,269][15444] Updated weights for policy 0, policy_version 47751 (0.0013) [2024-08-05 19:10:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). 
Total num frames: 391233536. Throughput: 0: 6058.0. Samples: 97809670. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:10:13,119][15372] Avg episode reward: [(0, '43.573')] [2024-08-05 19:10:13,797][15444] Updated weights for policy 0, policy_version 47761 (0.0015) [2024-08-05 19:10:17,265][15444] Updated weights for policy 0, policy_version 47771 (0.0013) [2024-08-05 19:10:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 391356416. Throughput: 0: 6037.3. Samples: 97845510. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:10:18,119][15372] Avg episode reward: [(0, '42.869')] [2024-08-05 19:10:18,189][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000047774_391364608.pth... [2024-08-05 19:10:18,306][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000047065_385556480.pth [2024-08-05 19:10:20,680][15444] Updated weights for policy 0, policy_version 47781 (0.0010) [2024-08-05 19:10:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24131.8). Total num frames: 391479296. Throughput: 0: 6035.6. Samples: 97864200. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:10:23,119][15372] Avg episode reward: [(0, '42.644')] [2024-08-05 19:10:24,246][15444] Updated weights for policy 0, policy_version 47791 (0.0012) [2024-08-05 19:10:27,509][15444] Updated weights for policy 0, policy_version 47801 (0.0017) [2024-08-05 19:10:28,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 391602176. Throughput: 0: 6028.0. Samples: 97899770. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:10:28,119][15372] Avg episode reward: [(0, '42.542')] [2024-08-05 19:10:30,788][15444] Updated weights for policy 0, policy_version 47811 (0.0011) [2024-08-05 19:10:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 391716864. Throughput: 0: 6049.8. Samples: 97936230. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:10:33,127][15372] Avg episode reward: [(0, '42.968')] [2024-08-05 19:10:34,131][15444] Updated weights for policy 0, policy_version 47821 (0.0021) [2024-08-05 19:10:37,673][15444] Updated weights for policy 0, policy_version 47831 (0.0028) [2024-08-05 19:10:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 391839744. Throughput: 0: 6032.9. Samples: 97954200. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:10:38,119][15372] Avg episode reward: [(0, '41.915')] [2024-08-05 19:10:41,044][15444] Updated weights for policy 0, policy_version 47841 (0.0017) [2024-08-05 19:10:43,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 391962624. Throughput: 0: 6006.5. Samples: 97989690. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:10:43,119][15372] Avg episode reward: [(0, '42.098')] [2024-08-05 19:10:44,620][15444] Updated weights for policy 0, policy_version 47851 (0.0012) [2024-08-05 19:10:44,695][15417] Signal inference workers to stop experience collection... (17600 times) [2024-08-05 19:10:44,696][15417] Signal inference workers to resume experience collection... 
(17600 times) [2024-08-05 19:10:44,759][15444] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-08-05 19:10:44,759][15444] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-08-05 19:10:47,875][15444] Updated weights for policy 0, policy_version 47861 (0.0021) [2024-08-05 19:10:48,128][15372] Fps is (10 sec: 23733.5, 60 sec: 24025.9, 300 sec: 24158.7). Total num frames: 392077312. Throughput: 0: 5998.2. Samples: 98026220. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:10:48,129][15372] Avg episode reward: [(0, '43.573')] [2024-08-05 19:10:51,090][15444] Updated weights for policy 0, policy_version 47871 (0.0011) [2024-08-05 19:10:53,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 392200192. Throughput: 0: 5992.7. Samples: 98044220. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:10:53,126][15372] Avg episode reward: [(0, '43.170')] [2024-08-05 19:10:54,904][15444] Updated weights for policy 0, policy_version 47881 (0.0013) [2024-08-05 19:10:58,051][15444] Updated weights for policy 0, policy_version 47891 (0.0029) [2024-08-05 19:10:58,118][15372] Fps is (10 sec: 24600.3, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 392323072. Throughput: 0: 6010.9. Samples: 98080160. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:10:58,119][15372] Avg episode reward: [(0, '41.810')] [2024-08-05 19:11:01,497][15444] Updated weights for policy 0, policy_version 47901 (0.0028) [2024-08-05 19:11:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 392437760. Throughput: 0: 6010.7. Samples: 98115990. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:11:03,126][15372] Avg episode reward: [(0, '41.488')] [2024-08-05 19:11:05,108][15444] Updated weights for policy 0, policy_version 47911 (0.0037) [2024-08-05 19:11:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.6). 
Total num frames: 392568832. Throughput: 0: 6010.2. Samples: 98134660. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:11:08,126][15372] Avg episode reward: [(0, '42.850')] [2024-08-05 19:11:08,131][15444] Updated weights for policy 0, policy_version 47921 (0.0025) [2024-08-05 19:11:11,929][15444] Updated weights for policy 0, policy_version 47931 (0.0016) [2024-08-05 19:11:13,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 392683520. Throughput: 0: 6018.4. Samples: 98170600. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:11:13,119][15372] Avg episode reward: [(0, '43.280')] [2024-08-05 19:11:15,067][15444] Updated weights for policy 0, policy_version 47941 (0.0013) [2024-08-05 19:11:18,119][15372] Fps is (10 sec: 22936.9, 60 sec: 24029.7, 300 sec: 24131.7). Total num frames: 392798208. Throughput: 0: 6026.0. Samples: 98207400. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:11:18,127][15372] Avg episode reward: [(0, '43.308')] [2024-08-05 19:11:18,480][15444] Updated weights for policy 0, policy_version 47951 (0.0026) [2024-08-05 19:11:21,832][15444] Updated weights for policy 0, policy_version 47961 (0.0013) [2024-08-05 19:11:23,119][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 392929280. Throughput: 0: 6028.0. Samples: 98225460. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:11:23,119][15372] Avg episode reward: [(0, '43.100')] [2024-08-05 19:11:25,110][15444] Updated weights for policy 0, policy_version 47971 (0.0019) [2024-08-05 19:11:27,218][15417] Signal inference workers to stop experience collection... (17650 times) [2024-08-05 19:11:27,218][15417] Signal inference workers to resume experience collection... 
(17650 times) [2024-08-05 19:11:27,266][15444] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-08-05 19:11:27,276][15444] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-08-05 19:11:28,119][15372] Fps is (10 sec: 25395.5, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 393052160. Throughput: 0: 6059.7. Samples: 98262380. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:11:28,126][15372] Avg episode reward: [(0, '44.088')] [2024-08-05 19:11:28,606][15444] Updated weights for policy 0, policy_version 47981 (0.0013) [2024-08-05 19:11:32,104][15444] Updated weights for policy 0, policy_version 47991 (0.0014) [2024-08-05 19:11:33,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 393166848. Throughput: 0: 6037.5. Samples: 98297850. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:11:33,119][15372] Avg episode reward: [(0, '43.565')] [2024-08-05 19:11:35,317][15444] Updated weights for policy 0, policy_version 48001 (0.0022) [2024-08-05 19:11:38,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 393289728. Throughput: 0: 6055.1. Samples: 98316700. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:11:38,119][15372] Avg episode reward: [(0, '43.328')] [2024-08-05 19:11:38,708][15444] Updated weights for policy 0, policy_version 48011 (0.0013) [2024-08-05 19:11:42,155][15444] Updated weights for policy 0, policy_version 48021 (0.0020) [2024-08-05 19:11:43,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 393412608. Throughput: 0: 6050.9. Samples: 98352450. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:11:43,119][15372] Avg episode reward: [(0, '43.413')] [2024-08-05 19:11:45,401][15444] Updated weights for policy 0, policy_version 48031 (0.0025) [2024-08-05 19:11:48,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24306.9, 300 sec: 24159.5). 
Total num frames: 393535488. Throughput: 0: 6082.4. Samples: 98389700. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:11:48,119][15372] Avg episode reward: [(0, '43.382')] [2024-08-05 19:11:48,958][15444] Updated weights for policy 0, policy_version 48041 (0.0018) [2024-08-05 19:11:51,985][15444] Updated weights for policy 0, policy_version 48051 (0.0011) [2024-08-05 19:11:53,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 393650176. Throughput: 0: 6076.7. Samples: 98408110. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:11:53,119][15372] Avg episode reward: [(0, '42.961')] [2024-08-05 19:11:55,456][15444] Updated weights for policy 0, policy_version 48061 (0.0026) [2024-08-05 19:11:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 393781248. Throughput: 0: 6088.5. Samples: 98444580. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:11:58,119][15372] Avg episode reward: [(0, '44.604')] [2024-08-05 19:11:58,122][15417] Saving new best policy, reward=44.604! [2024-08-05 19:11:58,845][15444] Updated weights for policy 0, policy_version 48071 (0.0015) [2024-08-05 19:12:02,212][15444] Updated weights for policy 0, policy_version 48081 (0.0013) [2024-08-05 19:12:03,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 393895936. Throughput: 0: 6060.5. Samples: 98480120. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:12:03,119][15372] Avg episode reward: [(0, '43.903')] [2024-08-05 19:12:05,781][15444] Updated weights for policy 0, policy_version 48091 (0.0012) [2024-08-05 19:12:08,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 394018816. Throughput: 0: 6078.0. Samples: 98498970. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:12:08,119][15372] Avg episode reward: [(0, '41.124')] [2024-08-05 19:12:08,935][15444] Updated weights for policy 0, policy_version 48101 (0.0017) [2024-08-05 19:12:10,540][15417] Signal inference workers to stop experience collection... (17700 times) [2024-08-05 19:12:10,541][15417] Signal inference workers to resume experience collection... (17700 times) [2024-08-05 19:12:10,588][15444] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-08-05 19:12:10,589][15444] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-08-05 19:12:12,464][15444] Updated weights for policy 0, policy_version 48111 (0.0018) [2024-08-05 19:12:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 394141696. Throughput: 0: 6063.4. Samples: 98535230. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 19:12:13,119][15372] Avg episode reward: [(0, '42.470')] [2024-08-05 19:12:15,665][15444] Updated weights for policy 0, policy_version 48121 (0.0013) [2024-08-05 19:12:18,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 394256384. Throughput: 0: 6076.2. Samples: 98571280. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 19:12:18,126][15372] Avg episode reward: [(0, '43.080')] [2024-08-05 19:12:18,174][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000048128_394264576.pth... 
[2024-08-05 19:12:18,285][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000047420_388464640.pth [2024-08-05 19:12:19,119][15444] Updated weights for policy 0, policy_version 48131 (0.0014) [2024-08-05 19:12:22,698][15444] Updated weights for policy 0, policy_version 48141 (0.0024) [2024-08-05 19:12:23,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 394379264. Throughput: 0: 6061.5. Samples: 98589470. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 19:12:23,119][15372] Avg episode reward: [(0, '43.261')] [2024-08-05 19:12:25,895][15444] Updated weights for policy 0, policy_version 48151 (0.0042) [2024-08-05 19:12:28,120][15372] Fps is (10 sec: 24572.2, 60 sec: 24165.8, 300 sec: 24187.1). Total num frames: 394502144. Throughput: 0: 6052.7. Samples: 98624830. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 19:12:28,128][15372] Avg episode reward: [(0, '42.723')] [2024-08-05 19:12:29,685][15444] Updated weights for policy 0, policy_version 48161 (0.0021) [2024-08-05 19:12:32,983][15444] Updated weights for policy 0, policy_version 48171 (0.0016) [2024-08-05 19:12:33,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 394616832. Throughput: 0: 6014.2. Samples: 98660340. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 19:12:33,119][15372] Avg episode reward: [(0, '43.042')] [2024-08-05 19:12:36,457][15444] Updated weights for policy 0, policy_version 48181 (0.0012) [2024-08-05 19:12:38,118][15372] Fps is (10 sec: 22941.3, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 394731520. Throughput: 0: 6026.2. Samples: 98679290. 
Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 19:12:38,126][15372] Avg episode reward: [(0, '43.327')] [2024-08-05 19:12:40,053][15444] Updated weights for policy 0, policy_version 48191 (0.0015) [2024-08-05 19:12:42,437][15417] Signal inference workers to stop experience collection... (17750 times) [2024-08-05 19:12:42,440][15417] Signal inference workers to resume experience collection... (17750 times) [2024-08-05 19:12:42,495][15444] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-08-05 19:12:42,496][15444] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-08-05 19:12:42,980][15444] Updated weights for policy 0, policy_version 48201 (0.0033) [2024-08-05 19:12:43,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 394862592. Throughput: 0: 6004.9. Samples: 98714800. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:12:43,119][15372] Avg episode reward: [(0, '43.103')] [2024-08-05 19:12:46,882][15444] Updated weights for policy 0, policy_version 48211 (0.0030) [2024-08-05 19:12:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 394969088. Throughput: 0: 5996.9. Samples: 98749980. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:12:48,119][15372] Avg episode reward: [(0, '42.854')] [2024-08-05 19:12:50,190][15444] Updated weights for policy 0, policy_version 48221 (0.0026) [2024-08-05 19:12:53,119][15372] Fps is (10 sec: 22936.9, 60 sec: 24029.7, 300 sec: 24131.7). Total num frames: 395091968. Throughput: 0: 6007.8. Samples: 98769320. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:12:53,127][15372] Avg episode reward: [(0, '43.097')] [2024-08-05 19:12:53,346][15444] Updated weights for policy 0, policy_version 48231 (0.0020) [2024-08-05 19:12:56,950][15444] Updated weights for policy 0, policy_version 48241 (0.0029) [2024-08-05 19:12:58,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 395223040. Throughput: 0: 5997.3. Samples: 98805110. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:12:58,119][15372] Avg episode reward: [(0, '43.380')] [2024-08-05 19:12:59,923][15444] Updated weights for policy 0, policy_version 48251 (0.0013) [2024-08-05 19:13:03,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 395337728. Throughput: 0: 5999.8. Samples: 98841270. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:13:03,126][15372] Avg episode reward: [(0, '42.661')] [2024-08-05 19:13:03,703][15444] Updated weights for policy 0, policy_version 48261 (0.0014) [2024-08-05 19:13:07,013][15444] Updated weights for policy 0, policy_version 48271 (0.0015) [2024-08-05 19:13:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24132.0). Total num frames: 395460608. Throughput: 0: 6004.9. Samples: 98859690. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:13:08,120][15372] Avg episode reward: [(0, '44.013')] [2024-08-05 19:13:10,282][15444] Updated weights for policy 0, policy_version 48281 (0.0026) [2024-08-05 19:13:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 395583488. Throughput: 0: 6030.2. Samples: 98896180. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 19:13:13,119][15372] Avg episode reward: [(0, '43.814')] [2024-08-05 19:13:13,747][15444] Updated weights for policy 0, policy_version 48291 (0.0035) [2024-08-05 19:13:16,932][15444] Updated weights for policy 0, policy_version 48301 (0.0024) [2024-08-05 19:13:18,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 395706368. Throughput: 0: 6045.4. Samples: 98932380. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 19:13:18,119][15372] Avg episode reward: [(0, '44.034')] [2024-08-05 19:13:20,380][15444] Updated weights for policy 0, policy_version 48311 (0.0019) [2024-08-05 19:13:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 395829248. Throughput: 0: 6040.7. Samples: 98951120. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 19:13:23,119][15372] Avg episode reward: [(0, '43.636')] [2024-08-05 19:13:24,023][15444] Updated weights for policy 0, policy_version 48321 (0.0017) [2024-08-05 19:13:24,858][15417] Signal inference workers to stop experience collection... (17800 times) [2024-08-05 19:13:24,858][15417] Signal inference workers to resume experience collection... (17800 times) [2024-08-05 19:13:24,940][15444] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-08-05 19:13:24,941][15444] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-08-05 19:13:27,061][15444] Updated weights for policy 0, policy_version 48331 (0.0017) [2024-08-05 19:13:28,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24030.5, 300 sec: 24131.7). Total num frames: 395943936. Throughput: 0: 6058.2. Samples: 98987420. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 19:13:28,119][15372] Avg episode reward: [(0, '42.570')] [2024-08-05 19:13:30,531][15444] Updated weights for policy 0, policy_version 48341 (0.0019) [2024-08-05 19:13:33,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24303.0, 300 sec: 24159.4). Total num frames: 396075008. Throughput: 0: 6106.2. Samples: 99024760. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 19:13:33,119][15372] Avg episode reward: [(0, '42.998')] [2024-08-05 19:13:33,658][15444] Updated weights for policy 0, policy_version 48351 (0.0019) [2024-08-05 19:13:37,077][15444] Updated weights for policy 0, policy_version 48361 (0.0013) [2024-08-05 19:13:38,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.8, 300 sec: 24159.5). Total num frames: 396189696. Throughput: 0: 6074.7. Samples: 99042680. Policy #0 lag: (min: 1.0, avg: 4.4, max: 8.0) [2024-08-05 19:13:38,119][15372] Avg episode reward: [(0, '43.089')] [2024-08-05 19:13:40,463][15444] Updated weights for policy 0, policy_version 48371 (0.0020) [2024-08-05 19:13:43,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 396312576. Throughput: 0: 6086.6. Samples: 99079010. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:13:43,119][15372] Avg episode reward: [(0, '43.036')] [2024-08-05 19:13:44,009][15444] Updated weights for policy 0, policy_version 48381 (0.0011) [2024-08-05 19:13:47,300][15444] Updated weights for policy 0, policy_version 48391 (0.0023) [2024-08-05 19:13:48,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 396427264. Throughput: 0: 6080.9. Samples: 99114910. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:13:48,126][15372] Avg episode reward: [(0, '43.344')] [2024-08-05 19:13:50,778][15444] Updated weights for policy 0, policy_version 48401 (0.0017) [2024-08-05 19:13:53,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24439.6, 300 sec: 24187.2). 
Total num frames: 396558336. Throughput: 0: 6084.0. Samples: 99133470. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:13:53,126][15372] Avg episode reward: [(0, '43.967')] [2024-08-05 19:13:54,077][15444] Updated weights for policy 0, policy_version 48411 (0.0017) [2024-08-05 19:13:57,531][15444] Updated weights for policy 0, policy_version 48421 (0.0011) [2024-08-05 19:13:58,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 396681216. Throughput: 0: 6075.3. Samples: 99169570. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:13:58,119][15372] Avg episode reward: [(0, '42.793')] [2024-08-05 19:14:00,906][15444] Updated weights for policy 0, policy_version 48431 (0.0018) [2024-08-05 19:14:03,123][15372] Fps is (10 sec: 23748.4, 60 sec: 24301.5, 300 sec: 24186.9). Total num frames: 396795904. Throughput: 0: 6061.1. Samples: 99205150. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:14:03,134][15372] Avg episode reward: [(0, '42.449')] [2024-08-05 19:14:04,288][15444] Updated weights for policy 0, policy_version 48441 (0.0014) [2024-08-05 19:14:05,964][15417] Signal inference workers to stop experience collection... (17850 times) [2024-08-05 19:14:05,965][15417] Signal inference workers to resume experience collection... (17850 times) [2024-08-05 19:14:06,005][15444] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-08-05 19:14:06,005][15444] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-08-05 19:14:07,808][15444] Updated weights for policy 0, policy_version 48451 (0.0024) [2024-08-05 19:14:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 396918784. Throughput: 0: 6052.7. Samples: 99223490. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:14:08,119][15372] Avg episode reward: [(0, '43.701')] [2024-08-05 19:14:10,836][15444] Updated weights for policy 0, policy_version 48461 (0.0011) [2024-08-05 19:14:13,118][15372] Fps is (10 sec: 24584.8, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 397041664. Throughput: 0: 6057.1. Samples: 99259990. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 19:14:13,126][15372] Avg episode reward: [(0, '43.838')] [2024-08-05 19:14:14,518][15444] Updated weights for policy 0, policy_version 48471 (0.0026) [2024-08-05 19:14:18,037][15444] Updated weights for policy 0, policy_version 48481 (0.0015) [2024-08-05 19:14:18,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 397156352. Throughput: 0: 6007.7. Samples: 99295110. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 19:14:18,119][15372] Avg episode reward: [(0, '43.555')] [2024-08-05 19:14:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000048481_397156352.pth... [2024-08-05 19:14:18,240][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000047774_391364608.pth [2024-08-05 19:14:21,276][15444] Updated weights for policy 0, policy_version 48491 (0.0015) [2024-08-05 19:14:23,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 397279232. Throughput: 0: 6020.7. Samples: 99313610. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 19:14:23,126][15372] Avg episode reward: [(0, '42.876')] [2024-08-05 19:14:24,819][15444] Updated weights for policy 0, policy_version 48501 (0.0021) [2024-08-05 19:14:28,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 397393920. Throughput: 0: 6020.3. Samples: 99349920. 
Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 19:14:28,126][15372] Avg episode reward: [(0, '43.754')] [2024-08-05 19:14:28,167][15444] Updated weights for policy 0, policy_version 48511 (0.0019) [2024-08-05 19:14:31,408][15444] Updated weights for policy 0, policy_version 48521 (0.0016) [2024-08-05 19:14:33,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 397516800. Throughput: 0: 6010.0. Samples: 99385360. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 19:14:33,139][15372] Avg episode reward: [(0, '43.522')] [2024-08-05 19:14:35,001][15444] Updated weights for policy 0, policy_version 48531 (0.0013) [2024-08-05 19:14:38,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 397639680. Throughput: 0: 6012.4. Samples: 99404030. Policy #0 lag: (min: 0.0, avg: 4.3, max: 7.0) [2024-08-05 19:14:38,126][15372] Avg episode reward: [(0, '43.109')] [2024-08-05 19:14:38,359][15444] Updated weights for policy 0, policy_version 48541 (0.0010) [2024-08-05 19:14:41,728][15444] Updated weights for policy 0, policy_version 48551 (0.0022) [2024-08-05 19:14:43,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 397762560. Throughput: 0: 6022.7. Samples: 99440590. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:14:43,119][15372] Avg episode reward: [(0, '42.271')] [2024-08-05 19:14:44,989][15444] Updated weights for policy 0, policy_version 48561 (0.0022) [2024-08-05 19:14:48,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 397885440. Throughput: 0: 6047.6. Samples: 99477270. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:14:48,126][15372] Avg episode reward: [(0, '42.733')] [2024-08-05 19:14:48,292][15444] Updated weights for policy 0, policy_version 48571 (0.0041) [2024-08-05 19:14:50,560][15417] Signal inference workers to stop experience collection... 
(17900 times) [2024-08-05 19:14:50,560][15417] Signal inference workers to resume experience collection... (17900 times) [2024-08-05 19:14:50,633][15444] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-08-05 19:14:50,634][15444] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-08-05 19:14:52,031][15444] Updated weights for policy 0, policy_version 48581 (0.0026) [2024-08-05 19:14:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 398000128. Throughput: 0: 6033.1. Samples: 99494980. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:14:53,119][15372] Avg episode reward: [(0, '43.119')] [2024-08-05 19:14:55,115][15444] Updated weights for policy 0, policy_version 48591 (0.0010) [2024-08-05 19:14:58,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 398123008. Throughput: 0: 6046.0. Samples: 99532060. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:14:58,126][15372] Avg episode reward: [(0, '43.217')] [2024-08-05 19:14:58,544][15444] Updated weights for policy 0, policy_version 48601 (0.0018) [2024-08-05 19:15:01,850][15444] Updated weights for policy 0, policy_version 48611 (0.0014) [2024-08-05 19:15:03,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24304.4, 300 sec: 24187.2). Total num frames: 398254080. Throughput: 0: 6074.3. Samples: 99568450. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:15:03,119][15372] Avg episode reward: [(0, '43.134')] [2024-08-05 19:15:05,046][15444] Updated weights for policy 0, policy_version 48621 (0.0020) [2024-08-05 19:15:08,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 398368768. Throughput: 0: 6067.3. Samples: 99586640. 
Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:15:08,127][15372] Avg episode reward: [(0, '42.767')] [2024-08-05 19:15:08,754][15444] Updated weights for policy 0, policy_version 48631 (0.0013) [2024-08-05 19:15:11,931][15444] Updated weights for policy 0, policy_version 48641 (0.0012) [2024-08-05 19:15:13,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 398483456. Throughput: 0: 6070.2. Samples: 99623080. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:15:13,119][15372] Avg episode reward: [(0, '43.890')] [2024-08-05 19:15:15,411][15444] Updated weights for policy 0, policy_version 48651 (0.0014) [2024-08-05 19:15:18,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24303.1, 300 sec: 24187.2). Total num frames: 398614528. Throughput: 0: 6101.1. Samples: 99659910. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:15:18,119][15372] Avg episode reward: [(0, '44.440')] [2024-08-05 19:15:18,730][15444] Updated weights for policy 0, policy_version 48661 (0.0018) [2024-08-05 19:15:22,068][15444] Updated weights for policy 0, policy_version 48671 (0.0018) [2024-08-05 19:15:23,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 398737408. Throughput: 0: 6086.9. Samples: 99677940. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:15:23,119][15372] Avg episode reward: [(0, '43.573')] [2024-08-05 19:15:25,596][15444] Updated weights for policy 0, policy_version 48681 (0.0025) [2024-08-05 19:15:28,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 398852096. Throughput: 0: 6083.5. Samples: 99714350. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:15:28,119][15372] Avg episode reward: [(0, '43.788')] [2024-08-05 19:15:28,816][15444] Updated weights for policy 0, policy_version 48691 (0.0011) [2024-08-05 19:15:32,381][15444] Updated weights for policy 0, policy_version 48701 (0.0020) [2024-08-05 19:15:33,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 398974976. Throughput: 0: 6070.2. Samples: 99750430. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:15:33,119][15372] Avg episode reward: [(0, '43.960')] [2024-08-05 19:15:35,516][15444] Updated weights for policy 0, policy_version 48711 (0.0011) [2024-08-05 19:15:37,101][15417] Signal inference workers to stop experience collection... (17950 times) [2024-08-05 19:15:37,109][15417] Signal inference workers to resume experience collection... (17950 times) [2024-08-05 19:15:37,143][15444] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-08-05 19:15:37,143][15444] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-08-05 19:15:38,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 399097856. Throughput: 0: 6081.1. Samples: 99768630. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:15:38,119][15372] Avg episode reward: [(0, '44.256')] [2024-08-05 19:15:38,895][15444] Updated weights for policy 0, policy_version 48721 (0.0011) [2024-08-05 19:15:42,375][15444] Updated weights for policy 0, policy_version 48731 (0.0029) [2024-08-05 19:15:43,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24302.7, 300 sec: 24215.8). Total num frames: 399220736. Throughput: 0: 6072.8. Samples: 99805340. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 19:15:43,120][15372] Avg episode reward: [(0, '43.284')] [2024-08-05 19:15:45,847][15444] Updated weights for policy 0, policy_version 48741 (0.0016) [2024-08-05 19:15:48,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 399335424. Throughput: 0: 6057.6. Samples: 99841040. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 19:15:48,126][15372] Avg episode reward: [(0, '42.718')] [2024-08-05 19:15:49,245][15444] Updated weights for policy 0, policy_version 48751 (0.0013) [2024-08-05 19:15:52,752][15444] Updated weights for policy 0, policy_version 48761 (0.0018) [2024-08-05 19:15:53,118][15372] Fps is (10 sec: 23758.4, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 399458304. Throughput: 0: 6047.6. Samples: 99858780. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 19:15:53,119][15372] Avg episode reward: [(0, '43.463')] [2024-08-05 19:15:55,855][15444] Updated weights for policy 0, policy_version 48771 (0.0024) [2024-08-05 19:15:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 399581184. Throughput: 0: 6034.7. Samples: 99894640. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 19:15:58,126][15372] Avg episode reward: [(0, '42.577')] [2024-08-05 19:15:59,485][15444] Updated weights for policy 0, policy_version 48781 (0.0017) [2024-08-05 19:16:03,018][15444] Updated weights for policy 0, policy_version 48791 (0.0022) [2024-08-05 19:16:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 399695872. Throughput: 0: 6010.0. Samples: 99930360. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 19:16:03,119][15372] Avg episode reward: [(0, '43.084')] [2024-08-05 19:16:06,223][15444] Updated weights for policy 0, policy_version 48801 (0.0010) [2024-08-05 19:16:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.5, 300 sec: 24187.2). 
Total num frames: 399818752. Throughput: 0: 6021.3. Samples: 99948900. Policy #0 lag: (min: 1.0, avg: 3.7, max: 7.0) [2024-08-05 19:16:08,126][15372] Avg episode reward: [(0, '43.609')] [2024-08-05 19:16:09,603][15444] Updated weights for policy 0, policy_version 48811 (0.0011) [2024-08-05 19:16:13,010][15444] Updated weights for policy 0, policy_version 48821 (0.0014) [2024-08-05 19:16:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 399941632. Throughput: 0: 6024.0. Samples: 99985430. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:16:13,119][15372] Avg episode reward: [(0, '42.320')] [2024-08-05 19:16:16,308][15444] Updated weights for policy 0, policy_version 48831 (0.0027) [2024-08-05 19:16:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 400056320. Throughput: 0: 6016.9. Samples: 100021190. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:16:18,126][15372] Avg episode reward: [(0, '42.545')] [2024-08-05 19:16:18,179][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000048836_400064512.pth... [2024-08-05 19:16:18,336][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000048128_394264576.pth [2024-08-05 19:16:19,783][15444] Updated weights for policy 0, policy_version 48841 (0.0022) [2024-08-05 19:16:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 400179200. Throughput: 0: 6023.4. Samples: 100039680. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:16:23,126][15372] Avg episode reward: [(0, '43.132')] [2024-08-05 19:16:23,239][15444] Updated weights for policy 0, policy_version 48851 (0.0022) [2024-08-05 19:16:26,662][15444] Updated weights for policy 0, policy_version 48861 (0.0011) [2024-08-05 19:16:28,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 400310272. Throughput: 0: 6010.5. Samples: 100075810. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:16:28,126][15372] Avg episode reward: [(0, '44.180')] [2024-08-05 19:16:29,830][15444] Updated weights for policy 0, policy_version 48871 (0.0017) [2024-08-05 19:16:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 400424960. Throughput: 0: 6019.8. Samples: 100111930. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:16:33,126][15372] Avg episode reward: [(0, '43.456')] [2024-08-05 19:16:33,556][15444] Updated weights for policy 0, policy_version 48881 (0.0021) [2024-08-05 19:16:36,735][15444] Updated weights for policy 0, policy_version 48891 (0.0019) [2024-08-05 19:16:38,119][15372] Fps is (10 sec: 22936.9, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 400539648. Throughput: 0: 6034.6. Samples: 100130340. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:16:38,120][15372] Avg episode reward: [(0, '44.364')] [2024-08-05 19:16:40,217][15444] Updated weights for policy 0, policy_version 48901 (0.0012) [2024-08-05 19:16:42,304][15417] Signal inference workers to stop experience collection... (18000 times) [2024-08-05 19:16:42,304][15417] Signal inference workers to resume experience collection... 
(18000 times) [2024-08-05 19:16:42,377][15444] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-08-05 19:16:42,377][15444] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-08-05 19:16:43,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24030.1, 300 sec: 24159.4). Total num frames: 400662528. Throughput: 0: 6053.5. Samples: 100167050. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:16:43,126][15372] Avg episode reward: [(0, '43.615')] [2024-08-05 19:16:43,316][15444] Updated weights for policy 0, policy_version 48911 (0.0024) [2024-08-05 19:16:46,810][15444] Updated weights for policy 0, policy_version 48921 (0.0021) [2024-08-05 19:16:48,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 400785408. Throughput: 0: 6047.6. Samples: 100202500. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:16:48,119][15372] Avg episode reward: [(0, '43.033')] [2024-08-05 19:16:50,492][15444] Updated weights for policy 0, policy_version 48931 (0.0012) [2024-08-05 19:16:53,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 400908288. Throughput: 0: 6057.1. Samples: 100221470. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:16:53,119][15372] Avg episode reward: [(0, '42.969')] [2024-08-05 19:16:53,629][15444] Updated weights for policy 0, policy_version 48941 (0.0013) [2024-08-05 19:16:57,211][15444] Updated weights for policy 0, policy_version 48951 (0.0012) [2024-08-05 19:16:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 401031168. Throughput: 0: 6038.4. Samples: 100257160. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:16:58,119][15372] Avg episode reward: [(0, '43.426')] [2024-08-05 19:17:00,334][15444] Updated weights for policy 0, policy_version 48961 (0.0034) [2024-08-05 19:17:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). 
Total num frames: 401145856. Throughput: 0: 6054.0. Samples: 100293620. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:17:03,119][15372] Avg episode reward: [(0, '42.204')] [2024-08-05 19:17:03,821][15444] Updated weights for policy 0, policy_version 48971 (0.0018) [2024-08-05 19:17:07,567][15444] Updated weights for policy 0, policy_version 48981 (0.0032) [2024-08-05 19:17:08,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 401268736. Throughput: 0: 6042.4. Samples: 100311590. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:17:08,119][15372] Avg episode reward: [(0, '42.399')] [2024-08-05 19:17:10,555][15444] Updated weights for policy 0, policy_version 48991 (0.0013) [2024-08-05 19:17:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 401383424. Throughput: 0: 6036.4. Samples: 100347450. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:17:13,126][15372] Avg episode reward: [(0, '43.258')] [2024-08-05 19:17:14,198][15444] Updated weights for policy 0, policy_version 49001 (0.0032) [2024-08-05 19:17:17,912][15444] Updated weights for policy 0, policy_version 49011 (0.0011) [2024-08-05 19:17:18,121][15372] Fps is (10 sec: 23750.2, 60 sec: 24165.2, 300 sec: 24159.2). Total num frames: 401506304. Throughput: 0: 6019.8. Samples: 100382840. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:17:18,122][15372] Avg episode reward: [(0, '43.210')] [2024-08-05 19:17:20,838][15444] Updated weights for policy 0, policy_version 49021 (0.0021) [2024-08-05 19:17:22,299][15417] Signal inference workers to stop experience collection... (18050 times) [2024-08-05 19:17:22,307][15417] Signal inference workers to resume experience collection... 
(18050 times) [2024-08-05 19:17:22,349][15444] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-08-05 19:17:22,349][15444] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-08-05 19:17:23,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24159.6). Total num frames: 401629184. Throughput: 0: 6017.2. Samples: 100401110. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:17:23,119][15372] Avg episode reward: [(0, '43.789')] [2024-08-05 19:17:24,392][15444] Updated weights for policy 0, policy_version 49031 (0.0019) [2024-08-05 19:17:27,782][15444] Updated weights for policy 0, policy_version 49041 (0.0029) [2024-08-05 19:17:28,119][15372] Fps is (10 sec: 23763.6, 60 sec: 23893.3, 300 sec: 24159.5). Total num frames: 401743872. Throughput: 0: 6022.9. Samples: 100438080. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:17:28,119][15372] Avg episode reward: [(0, '43.340')] [2024-08-05 19:17:31,094][15444] Updated weights for policy 0, policy_version 49051 (0.0013) [2024-08-05 19:17:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 401866752. Throughput: 0: 6029.1. Samples: 100473810. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:17:33,126][15372] Avg episode reward: [(0, '43.559')] [2024-08-05 19:17:34,723][15444] Updated weights for policy 0, policy_version 49061 (0.0018) [2024-08-05 19:17:37,811][15444] Updated weights for policy 0, policy_version 49071 (0.0015) [2024-08-05 19:17:38,118][15372] Fps is (10 sec: 25395.4, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 401997824. Throughput: 0: 6001.5. Samples: 100491540. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:17:38,119][15372] Avg episode reward: [(0, '43.918')] [2024-08-05 19:17:41,432][15444] Updated weights for policy 0, policy_version 49081 (0.0024) [2024-08-05 19:17:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). 
Total num frames: 402104320. Throughput: 0: 6012.2. Samples: 100527710. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:17:43,119][15372] Avg episode reward: [(0, '43.682')] [2024-08-05 19:17:44,860][15444] Updated weights for policy 0, policy_version 49091 (0.0013) [2024-08-05 19:17:47,951][15444] Updated weights for policy 0, policy_version 49101 (0.0015) [2024-08-05 19:17:48,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 402235392. Throughput: 0: 6019.8. Samples: 100564510. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 19:17:48,119][15372] Avg episode reward: [(0, '42.946')] [2024-08-05 19:17:51,617][15444] Updated weights for policy 0, policy_version 49111 (0.0013) [2024-08-05 19:17:53,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 402358272. Throughput: 0: 6015.1. Samples: 100582270. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 19:17:53,119][15372] Avg episode reward: [(0, '43.533')] [2024-08-05 19:17:54,708][15444] Updated weights for policy 0, policy_version 49121 (0.0013) [2024-08-05 19:17:58,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 402472960. Throughput: 0: 6035.8. Samples: 100619060. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 19:17:58,126][15372] Avg episode reward: [(0, '44.160')] [2024-08-05 19:17:58,295][15444] Updated weights for policy 0, policy_version 49131 (0.0010) [2024-08-05 19:18:01,549][15444] Updated weights for policy 0, policy_version 49141 (0.0012) [2024-08-05 19:18:03,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 402595840. Throughput: 0: 6056.1. Samples: 100655350. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 19:18:03,127][15372] Avg episode reward: [(0, '43.469')] [2024-08-05 19:18:03,339][15417] Signal inference workers to stop experience collection... 
(18100 times) [2024-08-05 19:18:03,339][15417] Signal inference workers to resume experience collection... (18100 times) [2024-08-05 19:18:03,382][15444] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-08-05 19:18:03,382][15444] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-08-05 19:18:04,906][15444] Updated weights for policy 0, policy_version 49151 (0.0011) [2024-08-05 19:18:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 402718720. Throughput: 0: 6065.1. Samples: 100674040. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 19:18:08,126][15372] Avg episode reward: [(0, '43.631')] [2024-08-05 19:18:08,424][15444] Updated weights for policy 0, policy_version 49161 (0.0024) [2024-08-05 19:18:11,584][15444] Updated weights for policy 0, policy_version 49171 (0.0012) [2024-08-05 19:18:13,119][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 402841600. Throughput: 0: 6047.3. Samples: 100710210. Policy #0 lag: (min: 1.0, avg: 3.4, max: 7.0) [2024-08-05 19:18:13,126][15372] Avg episode reward: [(0, '43.660')] [2024-08-05 19:18:15,139][15444] Updated weights for policy 0, policy_version 49181 (0.0013) [2024-08-05 19:18:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24304.1, 300 sec: 24187.2). Total num frames: 402964480. Throughput: 0: 6060.9. Samples: 100746550. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 19:18:18,126][15372] Avg episode reward: [(0, '43.218')] [2024-08-05 19:18:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000049190_402964480.pth... 
[2024-08-05 19:18:18,280][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000048481_397156352.pth [2024-08-05 19:18:18,373][15444] Updated weights for policy 0, policy_version 49191 (0.0016) [2024-08-05 19:18:21,945][15444] Updated weights for policy 0, policy_version 49201 (0.0012) [2024-08-05 19:18:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 403079168. Throughput: 0: 6068.0. Samples: 100764600. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 19:18:23,119][15372] Avg episode reward: [(0, '43.156')] [2024-08-05 19:18:25,529][15444] Updated weights for policy 0, policy_version 49211 (0.0031) [2024-08-05 19:18:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 403202048. Throughput: 0: 6066.0. Samples: 100800680. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 19:18:28,119][15372] Avg episode reward: [(0, '43.748')] [2024-08-05 19:18:28,515][15444] Updated weights for policy 0, policy_version 49221 (0.0016) [2024-08-05 19:18:32,129][15444] Updated weights for policy 0, policy_version 49231 (0.0016) [2024-08-05 19:18:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 403324928. Throughput: 0: 6051.3. Samples: 100836820. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 19:18:33,119][15372] Avg episode reward: [(0, '42.881')] [2024-08-05 19:18:35,311][15444] Updated weights for policy 0, policy_version 49241 (0.0019) [2024-08-05 19:18:38,121][15372] Fps is (10 sec: 23751.5, 60 sec: 24029.0, 300 sec: 24159.3). Total num frames: 403439616. Throughput: 0: 6078.6. Samples: 100855820. 
Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 19:18:38,122][15372] Avg episode reward: [(0, '44.552')] [2024-08-05 19:18:38,790][15444] Updated weights for policy 0, policy_version 49251 (0.0014) [2024-08-05 19:18:42,143][15444] Updated weights for policy 0, policy_version 49261 (0.0013) [2024-08-05 19:18:42,704][15417] Signal inference workers to stop experience collection... (18150 times) [2024-08-05 19:18:42,704][15417] Signal inference workers to resume experience collection... (18150 times) [2024-08-05 19:18:42,774][15444] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-08-05 19:18:42,784][15444] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-08-05 19:18:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 403562496. Throughput: 0: 6060.2. Samples: 100891770. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 19:18:43,119][15372] Avg episode reward: [(0, '44.566')] [2024-08-05 19:18:45,369][15444] Updated weights for policy 0, policy_version 49271 (0.0012) [2024-08-05 19:18:48,119][15372] Fps is (10 sec: 25400.5, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 403693568. Throughput: 0: 6073.8. Samples: 100928670. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:18:48,119][15372] Avg episode reward: [(0, '43.441')] [2024-08-05 19:18:49,003][15444] Updated weights for policy 0, policy_version 49281 (0.0015) [2024-08-05 19:18:52,121][15444] Updated weights for policy 0, policy_version 49291 (0.0020) [2024-08-05 19:18:53,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 403800064. Throughput: 0: 6067.6. Samples: 100947080. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:18:53,119][15372] Avg episode reward: [(0, '43.519')] [2024-08-05 19:18:55,642][15444] Updated weights for policy 0, policy_version 49301 (0.0025) [2024-08-05 19:18:58,121][15372] Fps is (10 sec: 23750.2, 60 sec: 24301.8, 300 sec: 24187.3). Total num frames: 403931136. Throughput: 0: 6041.4. Samples: 100982090. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:18:58,129][15372] Avg episode reward: [(0, '43.744')] [2024-08-05 19:18:59,590][15444] Updated weights for policy 0, policy_version 49311 (0.0014) [2024-08-05 19:19:02,441][15444] Updated weights for policy 0, policy_version 49321 (0.0017) [2024-08-05 19:19:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 404045824. Throughput: 0: 6022.2. Samples: 101017550. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:19:03,119][15372] Avg episode reward: [(0, '43.183')] [2024-08-05 19:19:06,283][15444] Updated weights for policy 0, policy_version 49331 (0.0023) [2024-08-05 19:19:08,118][15372] Fps is (10 sec: 22944.4, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 404160512. Throughput: 0: 6030.7. Samples: 101035980. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:19:08,119][15372] Avg episode reward: [(0, '42.326')] [2024-08-05 19:19:09,264][15444] Updated weights for policy 0, policy_version 49341 (0.0021) [2024-08-05 19:19:12,766][15444] Updated weights for policy 0, policy_version 49351 (0.0036) [2024-08-05 19:19:13,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 404291584. Throughput: 0: 6049.1. Samples: 101072890. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:19:13,119][15372] Avg episode reward: [(0, '44.073')] [2024-08-05 19:19:15,456][15417] Signal inference workers to stop experience collection... (18200 times) [2024-08-05 19:19:15,456][15417] Signal inference workers to resume experience collection... 
(18200 times) [2024-08-05 19:19:15,519][15444] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-08-05 19:19:15,520][15444] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-08-05 19:19:16,294][15444] Updated weights for policy 0, policy_version 49361 (0.0018) [2024-08-05 19:19:18,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 404406272. Throughput: 0: 6038.0. Samples: 101108530. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:19:18,119][15372] Avg episode reward: [(0, '44.493')] [2024-08-05 19:19:19,374][15444] Updated weights for policy 0, policy_version 49371 (0.0021) [2024-08-05 19:19:22,829][15444] Updated weights for policy 0, policy_version 49381 (0.0018) [2024-08-05 19:19:23,119][15372] Fps is (10 sec: 23754.4, 60 sec: 24166.0, 300 sec: 24187.1). Total num frames: 404529152. Throughput: 0: 6039.3. Samples: 101127580. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:19:23,120][15372] Avg episode reward: [(0, '43.179')] [2024-08-05 19:19:26,010][15444] Updated weights for policy 0, policy_version 49391 (0.0023) [2024-08-05 19:19:28,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 404652032. Throughput: 0: 6050.2. Samples: 101164030. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:19:28,119][15372] Avg episode reward: [(0, '43.342')] [2024-08-05 19:19:29,698][15444] Updated weights for policy 0, policy_version 49401 (0.0021) [2024-08-05 19:19:33,118][15372] Fps is (10 sec: 23758.9, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 404766720. Throughput: 0: 6014.5. Samples: 101199320. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:19:33,126][15372] Avg episode reward: [(0, '44.054')] [2024-08-05 19:19:33,265][15444] Updated weights for policy 0, policy_version 49411 (0.0013) [2024-08-05 19:19:36,249][15444] Updated weights for policy 0, policy_version 49421 (0.0036) [2024-08-05 19:19:38,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24303.8, 300 sec: 24187.2). Total num frames: 404897792. Throughput: 0: 6021.1. Samples: 101218030. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:19:38,127][15372] Avg episode reward: [(0, '43.474')] [2024-08-05 19:19:39,910][15444] Updated weights for policy 0, policy_version 49431 (0.0014) [2024-08-05 19:19:43,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24166.2, 300 sec: 24159.4). Total num frames: 405012480. Throughput: 0: 6048.1. Samples: 101254240. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:19:43,126][15372] Avg episode reward: [(0, '42.697')] [2024-08-05 19:19:43,352][15444] Updated weights for policy 0, policy_version 49441 (0.0015) [2024-08-05 19:19:46,686][15444] Updated weights for policy 0, policy_version 49451 (0.0032) [2024-08-05 19:19:48,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23893.4, 300 sec: 24159.5). Total num frames: 405127168. Throughput: 0: 6043.3. Samples: 101289500. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:19:48,119][15372] Avg episode reward: [(0, '43.870')] [2024-08-05 19:19:50,165][15444] Updated weights for policy 0, policy_version 49461 (0.0011) [2024-08-05 19:19:53,119][15372] Fps is (10 sec: 24576.7, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 405258240. Throughput: 0: 6048.7. Samples: 101308170. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:19:53,126][15372] Avg episode reward: [(0, '44.004')] [2024-08-05 19:19:53,480][15444] Updated weights for policy 0, policy_version 49471 (0.0017) [2024-08-05 19:19:57,113][15444] Updated weights for policy 0, policy_version 49481 (0.0011) [2024-08-05 19:19:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24031.0, 300 sec: 24131.7). Total num frames: 405372928. Throughput: 0: 6022.2. Samples: 101343890. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:19:58,119][15372] Avg episode reward: [(0, '43.379')] [2024-08-05 19:20:00,340][15444] Updated weights for policy 0, policy_version 49491 (0.0014) [2024-08-05 19:20:03,118][15372] Fps is (10 sec: 22937.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 405487616. Throughput: 0: 6019.4. Samples: 101379400. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:20:03,119][15372] Avg episode reward: [(0, '42.993')] [2024-08-05 19:20:03,861][15444] Updated weights for policy 0, policy_version 49501 (0.0011) [2024-08-05 19:20:06,205][15417] Signal inference workers to stop experience collection... (18250 times) [2024-08-05 19:20:06,205][15417] Signal inference workers to resume experience collection... (18250 times) [2024-08-05 19:20:06,277][15444] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-08-05 19:20:06,277][15444] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-08-05 19:20:07,472][15444] Updated weights for policy 0, policy_version 49511 (0.0013) [2024-08-05 19:20:08,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 405610496. Throughput: 0: 5999.5. Samples: 101397550. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:20:08,119][15372] Avg episode reward: [(0, '42.927')] [2024-08-05 19:20:10,495][15444] Updated weights for policy 0, policy_version 49521 (0.0024) [2024-08-05 19:20:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 405733376. Throughput: 0: 5998.2. Samples: 101433950. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:20:13,119][15372] Avg episode reward: [(0, '42.693')] [2024-08-05 19:20:14,071][15444] Updated weights for policy 0, policy_version 49531 (0.0012) [2024-08-05 19:20:17,407][15444] Updated weights for policy 0, policy_version 49541 (0.0023) [2024-08-05 19:20:18,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 405848064. Throughput: 0: 5999.7. Samples: 101469310. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 19:20:18,120][15372] Avg episode reward: [(0, '43.316')] [2024-08-05 19:20:18,175][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000049543_405856256.pth... [2024-08-05 19:20:18,288][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000048836_400064512.pth [2024-08-05 19:20:20,806][15444] Updated weights for policy 0, policy_version 49551 (0.0013) [2024-08-05 19:20:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.8, 300 sec: 24159.5). Total num frames: 405979136. Throughput: 0: 5992.5. Samples: 101487690. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:20:23,119][15372] Avg episode reward: [(0, '43.517')] [2024-08-05 19:20:24,354][15444] Updated weights for policy 0, policy_version 49561 (0.0019) [2024-08-05 19:20:27,543][15444] Updated weights for policy 0, policy_version 49571 (0.0015) [2024-08-05 19:20:28,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 406093824. 
Throughput: 0: 5989.2. Samples: 101523750. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:20:28,119][15372] Avg episode reward: [(0, '43.438')] [2024-08-05 19:20:31,198][15444] Updated weights for policy 0, policy_version 49581 (0.0022) [2024-08-05 19:20:33,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 406208512. Throughput: 0: 5999.3. Samples: 101559470. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:20:33,119][15372] Avg episode reward: [(0, '43.889')] [2024-08-05 19:20:34,467][15444] Updated weights for policy 0, policy_version 49591 (0.0015) [2024-08-05 19:20:37,887][15444] Updated weights for policy 0, policy_version 49601 (0.0012) [2024-08-05 19:20:38,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.4, 300 sec: 24104.0). Total num frames: 406331392. Throughput: 0: 6001.6. Samples: 101578240. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:20:38,119][15372] Avg episode reward: [(0, '43.776')] [2024-08-05 19:20:41,181][15444] Updated weights for policy 0, policy_version 49611 (0.0019) [2024-08-05 19:20:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 406454272. Throughput: 0: 6012.9. Samples: 101614470. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:20:43,126][15372] Avg episode reward: [(0, '42.833')] [2024-08-05 19:20:44,511][15444] Updated weights for policy 0, policy_version 49621 (0.0026) [2024-08-05 19:20:48,071][15444] Updated weights for policy 0, policy_version 49631 (0.0022) [2024-08-05 19:20:48,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 406577152. Throughput: 0: 6035.3. Samples: 101650990. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 19:20:48,119][15372] Avg episode reward: [(0, '42.572')] [2024-08-05 19:20:51,058][15444] Updated weights for policy 0, policy_version 49641 (0.0019) [2024-08-05 19:20:52,026][15417] Signal inference workers to stop experience collection... (18300 times) [2024-08-05 19:20:52,034][15417] Signal inference workers to resume experience collection... (18300 times) [2024-08-05 19:20:52,091][15444] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-08-05 19:20:52,091][15444] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-08-05 19:20:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 406700032. Throughput: 0: 6037.3. Samples: 101669230. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:20:53,119][15372] Avg episode reward: [(0, '43.662')] [2024-08-05 19:20:54,648][15444] Updated weights for policy 0, policy_version 49651 (0.0012) [2024-08-05 19:20:58,119][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 406822912. Throughput: 0: 6046.9. Samples: 101706060. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:20:58,127][15372] Avg episode reward: [(0, '42.942')] [2024-08-05 19:20:58,134][15444] Updated weights for policy 0, policy_version 49661 (0.0022) [2024-08-05 19:21:01,283][15444] Updated weights for policy 0, policy_version 49671 (0.0021) [2024-08-05 19:21:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 406945792. Throughput: 0: 6049.4. Samples: 101741530. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:21:03,126][15372] Avg episode reward: [(0, '42.404')] [2024-08-05 19:21:04,663][15444] Updated weights for policy 0, policy_version 49681 (0.0017) [2024-08-05 19:21:08,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24166.1, 300 sec: 24131.6). Total num frames: 407060480. Throughput: 0: 6063.9. 
Samples: 101760570. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:21:08,127][15372] Avg episode reward: [(0, '43.081')] [2024-08-05 19:21:08,177][15444] Updated weights for policy 0, policy_version 49691 (0.0016) [2024-08-05 19:21:11,450][15444] Updated weights for policy 0, policy_version 49701 (0.0014) [2024-08-05 19:21:13,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24166.2, 300 sec: 24159.4). Total num frames: 407183360. Throughput: 0: 6068.8. Samples: 101796850. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:21:13,127][15372] Avg episode reward: [(0, '43.467')] [2024-08-05 19:21:14,890][15444] Updated weights for policy 0, policy_version 49711 (0.0023) [2024-08-05 19:21:18,119][15372] Fps is (10 sec: 24577.1, 60 sec: 24303.0, 300 sec: 24159.4). Total num frames: 407306240. Throughput: 0: 6079.1. Samples: 101833030. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:21:18,127][15372] Avg episode reward: [(0, '43.168')] [2024-08-05 19:21:18,253][15444] Updated weights for policy 0, policy_version 49721 (0.0013) [2024-08-05 19:21:21,576][15444] Updated weights for policy 0, policy_version 49731 (0.0015) [2024-08-05 19:21:23,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 407429120. Throughput: 0: 6069.8. Samples: 101851380. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:21:23,126][15372] Avg episode reward: [(0, '43.006')] [2024-08-05 19:21:25,059][15444] Updated weights for policy 0, policy_version 49741 (0.0019) [2024-08-05 19:21:28,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 407552000. Throughput: 0: 6081.6. Samples: 101888140. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:21:28,126][15372] Avg episode reward: [(0, '43.839')] [2024-08-05 19:21:28,329][15444] Updated weights for policy 0, policy_version 49751 (0.0012) [2024-08-05 19:21:31,788][15444] Updated weights for policy 0, policy_version 49761 (0.0016) [2024-08-05 19:21:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 407666688. Throughput: 0: 6060.5. Samples: 101923710. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:21:33,119][15372] Avg episode reward: [(0, '43.758')] [2024-08-05 19:21:35,241][15444] Updated weights for policy 0, policy_version 49771 (0.0012) [2024-08-05 19:21:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 407789568. Throughput: 0: 6083.8. Samples: 101943000. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:21:38,126][15372] Avg episode reward: [(0, '43.359')] [2024-08-05 19:21:38,453][15444] Updated weights for policy 0, policy_version 49781 (0.0013) [2024-08-05 19:21:42,108][15444] Updated weights for policy 0, policy_version 49791 (0.0013) [2024-08-05 19:21:43,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 407920640. Throughput: 0: 6059.6. Samples: 101978740. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:21:43,119][15372] Avg episode reward: [(0, '43.480')] [2024-08-05 19:21:45,136][15444] Updated weights for policy 0, policy_version 49801 (0.0010) [2024-08-05 19:21:48,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 408027136. Throughput: 0: 6064.4. Samples: 102014430. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:21:48,126][15372] Avg episode reward: [(0, '42.905')] [2024-08-05 19:21:48,780][15444] Updated weights for policy 0, policy_version 49811 (0.0020) [2024-08-05 19:21:49,521][15417] Signal inference workers to stop experience collection... 
(18350 times) [2024-08-05 19:21:49,526][15417] Signal inference workers to resume experience collection... (18350 times) [2024-08-05 19:21:49,596][15444] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-08-05 19:21:49,597][15444] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-08-05 19:21:52,419][15444] Updated weights for policy 0, policy_version 49821 (0.0014) [2024-08-05 19:21:53,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 408158208. Throughput: 0: 6051.9. Samples: 102032900. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 19:21:53,126][15372] Avg episode reward: [(0, '42.189')] [2024-08-05 19:21:55,369][15444] Updated weights for policy 0, policy_version 49831 (0.0022) [2024-08-05 19:21:58,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 408264704. Throughput: 0: 6028.7. Samples: 102068140. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 19:21:58,119][15372] Avg episode reward: [(0, '43.327')] [2024-08-05 19:21:59,239][15444] Updated weights for policy 0, policy_version 49841 (0.0015) [2024-08-05 19:22:02,547][15444] Updated weights for policy 0, policy_version 49851 (0.0041) [2024-08-05 19:22:03,118][15372] Fps is (10 sec: 22937.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 408387584. Throughput: 0: 6005.6. Samples: 102103280. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 19:22:03,119][15372] Avg episode reward: [(0, '43.766')] [2024-08-05 19:22:05,966][15444] Updated weights for policy 0, policy_version 49861 (0.0011) [2024-08-05 19:22:08,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24166.7, 300 sec: 24159.5). Total num frames: 408510464. Throughput: 0: 6000.7. Samples: 102121410. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 19:22:08,120][15372] Avg episode reward: [(0, '43.280')] [2024-08-05 19:22:09,556][15444] Updated weights for policy 0, policy_version 49871 (0.0020) [2024-08-05 19:22:12,734][15444] Updated weights for policy 0, policy_version 49881 (0.0033) [2024-08-05 19:22:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.6, 300 sec: 24159.7). Total num frames: 408633344. Throughput: 0: 5996.7. Samples: 102157990. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 19:22:13,119][15372] Avg episode reward: [(0, '42.911')] [2024-08-05 19:22:16,304][15444] Updated weights for policy 0, policy_version 49891 (0.0033) [2024-08-05 19:22:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 408748032. Throughput: 0: 5986.2. Samples: 102193090. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 19:22:18,119][15372] Avg episode reward: [(0, '43.195')] [2024-08-05 19:22:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000049896_408748032.pth... [2024-08-05 19:22:18,281][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000049190_402964480.pth [2024-08-05 19:22:19,552][15444] Updated weights for policy 0, policy_version 49901 (0.0017) [2024-08-05 19:22:23,108][15444] Updated weights for policy 0, policy_version 49911 (0.0022) [2024-08-05 19:22:23,119][15372] Fps is (10 sec: 23754.9, 60 sec: 24029.6, 300 sec: 24159.4). Total num frames: 408870912. Throughput: 0: 5971.9. Samples: 102211740. Policy #0 lag: (min: 0.0, avg: 3.5, max: 8.0) [2024-08-05 19:22:23,120][15372] Avg episode reward: [(0, '43.945')] [2024-08-05 19:22:26,712][15444] Updated weights for policy 0, policy_version 49921 (0.0012) [2024-08-05 19:22:28,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 408993792. 
Throughput: 0: 5971.1. Samples: 102247440. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:22:28,119][15372] Avg episode reward: [(0, '43.898')] [2024-08-05 19:22:29,831][15444] Updated weights for policy 0, policy_version 49931 (0.0022) [2024-08-05 19:22:31,960][15417] Signal inference workers to stop experience collection... (18400 times) [2024-08-05 19:22:31,962][15417] Signal inference workers to resume experience collection... (18400 times) [2024-08-05 19:22:32,038][15444] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-08-05 19:22:32,044][15444] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-08-05 19:22:33,119][15372] Fps is (10 sec: 23758.2, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 409108480. Throughput: 0: 5982.9. Samples: 102283660. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:22:33,126][15372] Avg episode reward: [(0, '44.175')] [2024-08-05 19:22:33,427][15444] Updated weights for policy 0, policy_version 49941 (0.0026) [2024-08-05 19:22:36,440][15444] Updated weights for policy 0, policy_version 49951 (0.0017) [2024-08-05 19:22:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 409231360. Throughput: 0: 5979.1. Samples: 102301960. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:22:38,119][15372] Avg episode reward: [(0, '41.761')] [2024-08-05 19:22:40,063][15444] Updated weights for policy 0, policy_version 49961 (0.0012) [2024-08-05 19:22:43,118][15372] Fps is (10 sec: 24576.4, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 409354240. Throughput: 0: 6013.8. Samples: 102338760. 
Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:22:43,126][15372] Avg episode reward: [(0, '41.235')] [2024-08-05 19:22:43,526][15444] Updated weights for policy 0, policy_version 49971 (0.0022) [2024-08-05 19:22:46,748][15444] Updated weights for policy 0, policy_version 49981 (0.0011) [2024-08-05 19:22:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 409468928. Throughput: 0: 6024.7. Samples: 102374390. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:22:48,119][15372] Avg episode reward: [(0, '42.986')] [2024-08-05 19:22:50,338][15444] Updated weights for policy 0, policy_version 49991 (0.0013) [2024-08-05 19:22:53,123][15372] Fps is (10 sec: 24564.5, 60 sec: 24028.0, 300 sec: 24159.1). Total num frames: 409600000. Throughput: 0: 6025.2. Samples: 102392570. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 19:22:53,123][15372] Avg episode reward: [(0, '43.770')] [2024-08-05 19:22:53,427][15444] Updated weights for policy 0, policy_version 50001 (0.0038) [2024-08-05 19:22:56,939][15444] Updated weights for policy 0, policy_version 50011 (0.0012) [2024-08-05 19:22:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 409714688. Throughput: 0: 6015.6. Samples: 102428690. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 19:22:58,119][15372] Avg episode reward: [(0, '42.771')] [2024-08-05 19:23:00,302][15444] Updated weights for policy 0, policy_version 50021 (0.0011) [2024-08-05 19:23:03,118][15372] Fps is (10 sec: 23768.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 409837568. Throughput: 0: 6058.2. Samples: 102465710. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 19:23:03,119][15372] Avg episode reward: [(0, '42.696')] [2024-08-05 19:23:03,810][15444] Updated weights for policy 0, policy_version 50031 (0.0019) [2024-08-05 19:23:06,940][15444] Updated weights for policy 0, policy_version 50041 (0.0012) [2024-08-05 19:23:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 409960448. Throughput: 0: 6055.2. Samples: 102484220. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 19:23:08,119][15372] Avg episode reward: [(0, '43.210')] [2024-08-05 19:23:10,387][15444] Updated weights for policy 0, policy_version 50051 (0.0026) [2024-08-05 19:23:13,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24166.2, 300 sec: 24131.7). Total num frames: 410083328. Throughput: 0: 6049.5. Samples: 102519670. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 19:23:13,119][15372] Avg episode reward: [(0, '41.686')] [2024-08-05 19:23:14,059][15444] Updated weights for policy 0, policy_version 50061 (0.0012) [2024-08-05 19:23:15,006][15417] Signal inference workers to stop experience collection... (18450 times) [2024-08-05 19:23:15,007][15417] Signal inference workers to resume experience collection... (18450 times) [2024-08-05 19:23:15,082][15444] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-08-05 19:23:15,082][15444] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-08-05 19:23:17,131][15444] Updated weights for policy 0, policy_version 50071 (0.0018) [2024-08-05 19:23:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 410198016. Throughput: 0: 6049.6. Samples: 102555890. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 19:23:18,119][15372] Avg episode reward: [(0, '41.851')] [2024-08-05 19:23:20,720][15444] Updated weights for policy 0, policy_version 50081 (0.0016) [2024-08-05 19:23:23,119][15372] Fps is (10 sec: 23757.6, 60 sec: 24166.7, 300 sec: 24131.7). Total num frames: 410320896. Throughput: 0: 6067.3. Samples: 102574990. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 19:23:23,119][15372] Avg episode reward: [(0, '43.086')] [2024-08-05 19:23:24,070][15444] Updated weights for policy 0, policy_version 50091 (0.0029) [2024-08-05 19:23:27,351][15444] Updated weights for policy 0, policy_version 50101 (0.0011) [2024-08-05 19:23:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 410443776. Throughput: 0: 6051.8. Samples: 102611090. Policy #0 lag: (min: 0.0, avg: 5.1, max: 9.0) [2024-08-05 19:23:28,119][15372] Avg episode reward: [(0, '44.458')] [2024-08-05 19:23:30,807][15444] Updated weights for policy 0, policy_version 50111 (0.0018) [2024-08-05 19:23:33,120][15372] Fps is (10 sec: 23753.5, 60 sec: 24165.9, 300 sec: 24131.8). Total num frames: 410558464. Throughput: 0: 6062.2. Samples: 102647200. Policy #0 lag: (min: 0.0, avg: 5.1, max: 9.0) [2024-08-05 19:23:33,129][15372] Avg episode reward: [(0, '44.555')] [2024-08-05 19:23:34,065][15444] Updated weights for policy 0, policy_version 50121 (0.0022) [2024-08-05 19:23:37,652][15444] Updated weights for policy 0, policy_version 50131 (0.0022) [2024-08-05 19:23:38,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 410681344. Throughput: 0: 6058.0. Samples: 102665150. Policy #0 lag: (min: 0.0, avg: 5.1, max: 9.0) [2024-08-05 19:23:38,119][15372] Avg episode reward: [(0, '43.812')] [2024-08-05 19:23:40,984][15444] Updated weights for policy 0, policy_version 50141 (0.0027) [2024-08-05 19:23:43,121][15372] Fps is (10 sec: 24574.7, 60 sec: 24165.6, 300 sec: 24103.8). 
Total num frames: 410804224. Throughput: 0: 6045.7. Samples: 102700760. Policy #0 lag: (min: 0.0, avg: 5.1, max: 9.0) [2024-08-05 19:23:43,121][15372] Avg episode reward: [(0, '44.021')] [2024-08-05 19:23:44,541][15444] Updated weights for policy 0, policy_version 50151 (0.0014) [2024-08-05 19:23:47,930][15444] Updated weights for policy 0, policy_version 50161 (0.0026) [2024-08-05 19:23:48,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 410918912. Throughput: 0: 6010.2. Samples: 102736170. Policy #0 lag: (min: 0.0, avg: 5.1, max: 9.0) [2024-08-05 19:23:48,119][15372] Avg episode reward: [(0, '42.606')] [2024-08-05 19:23:51,215][15444] Updated weights for policy 0, policy_version 50171 (0.0021) [2024-08-05 19:23:53,118][15372] Fps is (10 sec: 23761.7, 60 sec: 24031.7, 300 sec: 24104.2). Total num frames: 411041792. Throughput: 0: 6015.1. Samples: 102754900. Policy #0 lag: (min: 0.0, avg: 5.1, max: 9.0) [2024-08-05 19:23:53,126][15372] Avg episode reward: [(0, '42.739')] [2024-08-05 19:23:54,668][15444] Updated weights for policy 0, policy_version 50181 (0.0011) [2024-08-05 19:23:58,031][15444] Updated weights for policy 0, policy_version 50191 (0.0012) [2024-08-05 19:23:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 411164672. Throughput: 0: 6040.1. Samples: 102791470. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:23:58,119][15372] Avg episode reward: [(0, '43.363')] [2024-08-05 19:24:01,353][15444] Updated weights for policy 0, policy_version 50201 (0.0016) [2024-08-05 19:24:03,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 411287552. Throughput: 0: 6029.3. Samples: 102827210. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:24:03,127][15372] Avg episode reward: [(0, '43.742')] [2024-08-05 19:24:04,717][15444] Updated weights for policy 0, policy_version 50211 (0.0022) [2024-08-05 19:24:07,997][15444] Updated weights for policy 0, policy_version 50221 (0.0018) [2024-08-05 19:24:08,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 411410432. Throughput: 0: 6025.8. Samples: 102846150. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:24:08,119][15372] Avg episode reward: [(0, '43.966')] [2024-08-05 19:24:11,524][15444] Updated weights for policy 0, policy_version 50231 (0.0011) [2024-08-05 19:24:13,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.6, 300 sec: 24159.5). Total num frames: 411533312. Throughput: 0: 6024.2. Samples: 102882180. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:24:13,126][15372] Avg episode reward: [(0, '44.069')] [2024-08-05 19:24:14,825][15444] Updated weights for policy 0, policy_version 50241 (0.0014) [2024-08-05 19:24:16,738][15417] Signal inference workers to stop experience collection... (18500 times) [2024-08-05 19:24:16,739][15417] Signal inference workers to resume experience collection... (18500 times) [2024-08-05 19:24:16,803][15444] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-08-05 19:24:16,803][15444] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-08-05 19:24:18,106][15444] Updated weights for policy 0, policy_version 50251 (0.0012) [2024-08-05 19:24:18,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 411656192. Throughput: 0: 6043.1. Samples: 102919130. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:24:18,119][15372] Avg episode reward: [(0, '43.775')] [2024-08-05 19:24:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000050251_411656192.pth... [2024-08-05 19:24:18,234][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000049543_405856256.pth [2024-08-05 19:24:21,586][15444] Updated weights for policy 0, policy_version 50261 (0.0012) [2024-08-05 19:24:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 411770880. Throughput: 0: 6045.8. Samples: 102937210. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:24:23,119][15372] Avg episode reward: [(0, '42.379')] [2024-08-05 19:24:24,935][15444] Updated weights for policy 0, policy_version 50271 (0.0012) [2024-08-05 19:24:28,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 411893760. Throughput: 0: 6069.4. Samples: 102973870. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:24:28,126][15372] Avg episode reward: [(0, '43.133')] [2024-08-05 19:24:28,387][15444] Updated weights for policy 0, policy_version 50281 (0.0024) [2024-08-05 19:24:31,749][15444] Updated weights for policy 0, policy_version 50291 (0.0012) [2024-08-05 19:24:33,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24303.5, 300 sec: 24131.7). Total num frames: 412016640. Throughput: 0: 6078.2. Samples: 103009690. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:24:33,119][15372] Avg episode reward: [(0, '44.595')] [2024-08-05 19:24:34,989][15444] Updated weights for policy 0, policy_version 50301 (0.0027) [2024-08-05 19:24:38,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 412139520. Throughput: 0: 6075.8. Samples: 103028310. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:24:38,126][15372] Avg episode reward: [(0, '44.226')] [2024-08-05 19:24:38,533][15444] Updated weights for policy 0, policy_version 50311 (0.0020) [2024-08-05 19:24:41,793][15444] Updated weights for policy 0, policy_version 50321 (0.0011) [2024-08-05 19:24:43,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.7, 300 sec: 24187.2). Total num frames: 412262400. Throughput: 0: 6063.8. Samples: 103064340. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:24:43,119][15372] Avg episode reward: [(0, '43.886')] [2024-08-05 19:24:45,254][15444] Updated weights for policy 0, policy_version 50331 (0.0016) [2024-08-05 19:24:48,121][15372] Fps is (10 sec: 24569.3, 60 sec: 24438.4, 300 sec: 24159.2). Total num frames: 412385280. Throughput: 0: 6084.8. Samples: 103101040. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:24:48,129][15372] Avg episode reward: [(0, '43.747')] [2024-08-05 19:24:48,696][15444] Updated weights for policy 0, policy_version 50341 (0.0032) [2024-08-05 19:24:51,839][15444] Updated weights for policy 0, policy_version 50351 (0.0026) [2024-08-05 19:24:53,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24302.8, 300 sec: 24159.4). Total num frames: 412499968. Throughput: 0: 6070.4. Samples: 103119320. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:24:53,119][15372] Avg episode reward: [(0, '43.864')] [2024-08-05 19:24:55,356][15444] Updated weights for policy 0, policy_version 50361 (0.0014) [2024-08-05 19:24:58,119][15372] Fps is (10 sec: 23762.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 412622848. Throughput: 0: 6086.9. Samples: 103156090. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:24:58,119][15372] Avg episode reward: [(0, '42.654')] [2024-08-05 19:24:58,581][15444] Updated weights for policy 0, policy_version 50371 (0.0011) [2024-08-05 19:25:02,082][15444] Updated weights for policy 0, policy_version 50381 (0.0015) [2024-08-05 19:25:03,118][15372] Fps is (10 sec: 24577.0, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 412745728. Throughput: 0: 6066.2. Samples: 103192110. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 19:25:03,119][15372] Avg episode reward: [(0, '42.688')] [2024-08-05 19:25:05,322][15444] Updated weights for policy 0, policy_version 50391 (0.0017) [2024-08-05 19:25:08,119][15372] Fps is (10 sec: 23754.9, 60 sec: 24166.1, 300 sec: 24159.4). Total num frames: 412860416. Throughput: 0: 6073.6. Samples: 103210530. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 19:25:08,120][15372] Avg episode reward: [(0, '42.719')] [2024-08-05 19:25:08,802][15444] Updated weights for policy 0, policy_version 50401 (0.0014) [2024-08-05 19:25:12,229][15444] Updated weights for policy 0, policy_version 50411 (0.0014) [2024-08-05 19:25:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 412983296. Throughput: 0: 6054.9. Samples: 103246340. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 19:25:13,119][15372] Avg episode reward: [(0, '42.789')] [2024-08-05 19:25:15,529][15444] Updated weights for policy 0, policy_version 50421 (0.0024) [2024-08-05 19:25:18,119][15372] Fps is (10 sec: 25397.4, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 413114368. Throughput: 0: 6083.8. Samples: 103283460. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 19:25:18,119][15372] Avg episode reward: [(0, '42.832')] [2024-08-05 19:25:19,015][15444] Updated weights for policy 0, policy_version 50431 (0.0016) [2024-08-05 19:25:22,150][15444] Updated weights for policy 0, policy_version 50441 (0.0020) [2024-08-05 19:25:23,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 413220864. Throughput: 0: 6064.6. Samples: 103301220. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 19:25:23,127][15372] Avg episode reward: [(0, '42.749')] [2024-08-05 19:25:23,294][15417] Signal inference workers to stop experience collection... (18550 times) [2024-08-05 19:25:23,294][15417] Signal inference workers to resume experience collection... (18550 times) [2024-08-05 19:25:23,365][15444] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-08-05 19:25:23,365][15444] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-08-05 19:25:25,801][15444] Updated weights for policy 0, policy_version 50451 (0.0023) [2024-08-05 19:25:28,120][15372] Fps is (10 sec: 23753.0, 60 sec: 24302.3, 300 sec: 24214.9). Total num frames: 413351936. Throughput: 0: 6067.1. Samples: 103337370. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 19:25:28,120][15372] Avg episode reward: [(0, '43.206')] [2024-08-05 19:25:29,053][15444] Updated weights for policy 0, policy_version 50461 (0.0011) [2024-08-05 19:25:32,425][15444] Updated weights for policy 0, policy_version 50471 (0.0020) [2024-08-05 19:25:33,118][15372] Fps is (10 sec: 25395.6, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 413474816. Throughput: 0: 6058.6. Samples: 103373660. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:25:33,126][15372] Avg episode reward: [(0, '43.470')] [2024-08-05 19:25:35,929][15444] Updated weights for policy 0, policy_version 50481 (0.0031) [2024-08-05 19:25:38,119][15372] Fps is (10 sec: 23760.6, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 413589504. Throughput: 0: 6051.4. Samples: 103391630. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:25:38,126][15372] Avg episode reward: [(0, '44.053')] [2024-08-05 19:25:39,331][15444] Updated weights for policy 0, policy_version 50491 (0.0017) [2024-08-05 19:25:42,808][15444] Updated weights for policy 0, policy_version 50501 (0.0015) [2024-08-05 19:25:43,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 413704192. Throughput: 0: 6032.0. Samples: 103427530. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:25:43,119][15372] Avg episode reward: [(0, '43.417')] [2024-08-05 19:25:46,212][15444] Updated weights for policy 0, policy_version 50511 (0.0017) [2024-08-05 19:25:48,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24167.4, 300 sec: 24187.2). Total num frames: 413835264. Throughput: 0: 6029.3. Samples: 103463430. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:25:48,126][15372] Avg episode reward: [(0, '43.273')] [2024-08-05 19:25:49,510][15444] Updated weights for policy 0, policy_version 50521 (0.0018) [2024-08-05 19:25:53,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 413941760. Throughput: 0: 6041.7. Samples: 103482400. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:25:53,126][15372] Avg episode reward: [(0, '42.615')] [2024-08-05 19:25:53,151][15444] Updated weights for policy 0, policy_version 50531 (0.0020) [2024-08-05 19:25:56,255][15444] Updated weights for policy 0, policy_version 50541 (0.0012) [2024-08-05 19:25:58,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24159.4). 
Total num frames: 414072832. Throughput: 0: 6040.0. Samples: 103518140. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:25:58,126][15372] Avg episode reward: [(0, '42.567')] [2024-08-05 19:25:59,576][15444] Updated weights for policy 0, policy_version 50551 (0.0019) [2024-08-05 19:26:03,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 414187520. Throughput: 0: 6027.1. Samples: 103554680. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:26:03,127][15372] Avg episode reward: [(0, '43.357')] [2024-08-05 19:26:03,158][15444] Updated weights for policy 0, policy_version 50561 (0.0021) [2024-08-05 19:26:06,291][15444] Updated weights for policy 0, policy_version 50571 (0.0011) [2024-08-05 19:26:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.8, 300 sec: 24159.5). Total num frames: 414310400. Throughput: 0: 6048.5. Samples: 103573400. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:26:08,126][15372] Avg episode reward: [(0, '42.916')] [2024-08-05 19:26:09,808][15444] Updated weights for policy 0, policy_version 50581 (0.0026) [2024-08-05 19:26:13,117][15444] Updated weights for policy 0, policy_version 50591 (0.0028) [2024-08-05 19:26:13,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 414441472. Throughput: 0: 6056.7. Samples: 103609910. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:26:13,126][15372] Avg episode reward: [(0, '43.085')] [2024-08-05 19:26:16,485][15444] Updated weights for policy 0, policy_version 50601 (0.0012) [2024-08-05 19:26:18,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 414556160. Throughput: 0: 6044.6. Samples: 103645670. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:26:18,126][15372] Avg episode reward: [(0, '44.137')] [2024-08-05 19:26:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000050605_414556160.pth... [2024-08-05 19:26:18,336][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000049896_408748032.pth [2024-08-05 19:26:19,942][15444] Updated weights for policy 0, policy_version 50611 (0.0036) [2024-08-05 19:26:22,799][15417] Signal inference workers to stop experience collection... (18600 times) [2024-08-05 19:26:22,800][15417] Signal inference workers to resume experience collection... (18600 times) [2024-08-05 19:26:22,831][15444] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-08-05 19:26:22,831][15444] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-08-05 19:26:23,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 414679040. Throughput: 0: 6056.4. Samples: 103664170. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:26:23,119][15372] Avg episode reward: [(0, '43.507')] [2024-08-05 19:26:23,298][15444] Updated weights for policy 0, policy_version 50621 (0.0015) [2024-08-05 19:26:26,790][15444] Updated weights for policy 0, policy_version 50631 (0.0014) [2024-08-05 19:26:28,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24167.1, 300 sec: 24187.2). Total num frames: 414801920. Throughput: 0: 6038.0. Samples: 103699240. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:26:28,119][15372] Avg episode reward: [(0, '43.164')] [2024-08-05 19:26:30,156][15444] Updated weights for policy 0, policy_version 50641 (0.0021) [2024-08-05 19:26:33,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 414916608. Throughput: 0: 6061.3. Samples: 103736190. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:26:33,127][15372] Avg episode reward: [(0, '43.137')] [2024-08-05 19:26:33,587][15444] Updated weights for policy 0, policy_version 50651 (0.0021) [2024-08-05 19:26:37,015][15444] Updated weights for policy 0, policy_version 50661 (0.0015) [2024-08-05 19:26:38,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 415039488. Throughput: 0: 6038.2. Samples: 103754120. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:26:38,119][15372] Avg episode reward: [(0, '43.424')] [2024-08-05 19:26:40,288][15444] Updated weights for policy 0, policy_version 50671 (0.0012) [2024-08-05 19:26:43,119][15372] Fps is (10 sec: 24576.7, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 415162368. Throughput: 0: 6050.2. Samples: 103790400. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:26:43,125][15372] Avg episode reward: [(0, '42.935')] [2024-08-05 19:26:43,735][15444] Updated weights for policy 0, policy_version 50681 (0.0011) [2024-08-05 19:26:47,146][15444] Updated weights for policy 0, policy_version 50691 (0.0013) [2024-08-05 19:26:48,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 415285248. Throughput: 0: 6033.3. Samples: 103826180. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:26:48,119][15372] Avg episode reward: [(0, '42.271')] [2024-08-05 19:26:50,619][15444] Updated weights for policy 0, policy_version 50701 (0.0018) [2024-08-05 19:26:53,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 415399936. Throughput: 0: 6027.1. Samples: 103844620. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:26:53,127][15372] Avg episode reward: [(0, '42.683')] [2024-08-05 19:26:53,981][15444] Updated weights for policy 0, policy_version 50711 (0.0018) [2024-08-05 19:26:57,287][15444] Updated weights for policy 0, policy_version 50721 (0.0015) [2024-08-05 19:26:58,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 415522816. Throughput: 0: 6030.7. Samples: 103881290. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:26:58,126][15372] Avg episode reward: [(0, '42.920')] [2024-08-05 19:27:00,472][15444] Updated weights for policy 0, policy_version 50731 (0.0010) [2024-08-05 19:27:03,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 415637504. Throughput: 0: 6018.0. Samples: 103916480. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:27:03,126][15372] Avg episode reward: [(0, '42.667')] [2024-08-05 19:27:04,237][15444] Updated weights for policy 0, policy_version 50741 (0.0011) [2024-08-05 19:27:07,777][15444] Updated weights for policy 0, policy_version 50751 (0.0015) [2024-08-05 19:27:08,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 415760384. Throughput: 0: 6014.2. Samples: 103934810. Policy #0 lag: (min: 0.0, avg: 2.8, max: 8.0) [2024-08-05 19:27:08,119][15372] Avg episode reward: [(0, '42.230')] [2024-08-05 19:27:10,950][15444] Updated weights for policy 0, policy_version 50761 (0.0038) [2024-08-05 19:27:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 415883264. Throughput: 0: 6027.6. Samples: 103970480. 
Policy #0 lag: (min: 0.0, avg: 2.8, max: 8.0) [2024-08-05 19:27:13,126][15372] Avg episode reward: [(0, '42.026')] [2024-08-05 19:27:14,717][15444] Updated weights for policy 0, policy_version 50771 (0.0014) [2024-08-05 19:27:17,728][15444] Updated weights for policy 0, policy_version 50781 (0.0017) [2024-08-05 19:27:18,119][15372] Fps is (10 sec: 23754.9, 60 sec: 24029.5, 300 sec: 24159.4). Total num frames: 415997952. Throughput: 0: 5995.3. Samples: 104005980. Policy #0 lag: (min: 0.0, avg: 2.8, max: 8.0) [2024-08-05 19:27:18,120][15372] Avg episode reward: [(0, '42.479')] [2024-08-05 19:27:20,885][15417] Signal inference workers to stop experience collection... (18650 times) [2024-08-05 19:27:20,886][15417] Signal inference workers to resume experience collection... (18650 times) [2024-08-05 19:27:20,961][15444] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-08-05 19:27:20,965][15444] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-08-05 19:27:21,350][15444] Updated weights for policy 0, policy_version 50791 (0.0042) [2024-08-05 19:27:23,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 416112640. Throughput: 0: 6020.9. Samples: 104025060. Policy #0 lag: (min: 0.0, avg: 2.8, max: 8.0) [2024-08-05 19:27:23,119][15372] Avg episode reward: [(0, '44.016')] [2024-08-05 19:27:24,668][15444] Updated weights for policy 0, policy_version 50801 (0.0026) [2024-08-05 19:27:27,908][15444] Updated weights for policy 0, policy_version 50811 (0.0033) [2024-08-05 19:27:28,120][15372] Fps is (10 sec: 24574.6, 60 sec: 24029.2, 300 sec: 24187.1). Total num frames: 416243712. Throughput: 0: 6017.8. Samples: 104061210. 
Policy #0 lag: (min: 0.0, avg: 2.8, max: 8.0) [2024-08-05 19:27:28,120][15372] Avg episode reward: [(0, '43.798')] [2024-08-05 19:27:31,549][15444] Updated weights for policy 0, policy_version 50821 (0.0012) [2024-08-05 19:27:33,119][15372] Fps is (10 sec: 24574.5, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 416358400. Throughput: 0: 6016.8. Samples: 104096940. Policy #0 lag: (min: 0.0, avg: 2.8, max: 8.0) [2024-08-05 19:27:33,119][15372] Avg episode reward: [(0, '43.324')] [2024-08-05 19:27:34,742][15444] Updated weights for policy 0, policy_version 50831 (0.0016) [2024-08-05 19:27:38,118][15372] Fps is (10 sec: 23760.6, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 416481280. Throughput: 0: 6024.0. Samples: 104115700. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:27:38,126][15372] Avg episode reward: [(0, '43.369')] [2024-08-05 19:27:38,241][15444] Updated weights for policy 0, policy_version 50841 (0.0015) [2024-08-05 19:27:41,594][15444] Updated weights for policy 0, policy_version 50851 (0.0021) [2024-08-05 19:27:43,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 416604160. Throughput: 0: 6002.6. Samples: 104151410. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:27:43,127][15372] Avg episode reward: [(0, '44.150')] [2024-08-05 19:27:44,812][15444] Updated weights for policy 0, policy_version 50861 (0.0023) [2024-08-05 19:27:48,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23893.4, 300 sec: 24132.1). Total num frames: 416718848. Throughput: 0: 6026.0. Samples: 104187650. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:27:48,126][15372] Avg episode reward: [(0, '44.742')] [2024-08-05 19:27:48,131][15417] Saving new best policy, reward=44.742! 
[2024-08-05 19:27:48,525][15444] Updated weights for policy 0, policy_version 50871 (0.0021) [2024-08-05 19:27:51,837][15444] Updated weights for policy 0, policy_version 50881 (0.0013) [2024-08-05 19:27:53,119][15372] Fps is (10 sec: 24576.9, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 416849920. Throughput: 0: 6026.2. Samples: 104205990. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:27:53,119][15372] Avg episode reward: [(0, '43.649')] [2024-08-05 19:27:55,203][15444] Updated weights for policy 0, policy_version 50891 (0.0031) [2024-08-05 19:27:58,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 416972800. Throughput: 0: 6038.4. Samples: 104242210. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:27:58,126][15372] Avg episode reward: [(0, '43.059')] [2024-08-05 19:27:58,728][15444] Updated weights for policy 0, policy_version 50901 (0.0022) [2024-08-05 19:28:01,906][15444] Updated weights for policy 0, policy_version 50911 (0.0013) [2024-08-05 19:28:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 417087488. Throughput: 0: 6034.1. Samples: 104277510. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:28:03,119][15372] Avg episode reward: [(0, '43.311')] [2024-08-05 19:28:05,472][15444] Updated weights for policy 0, policy_version 50921 (0.0014) [2024-08-05 19:28:08,119][15372] Fps is (10 sec: 22937.2, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 417202176. Throughput: 0: 6021.3. Samples: 104296020. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:28:08,126][15372] Avg episode reward: [(0, '42.260')] [2024-08-05 19:28:08,348][15417] Signal inference workers to stop experience collection... (18700 times) [2024-08-05 19:28:08,356][15417] Signal inference workers to resume experience collection... 
(18700 times) [2024-08-05 19:28:08,394][15444] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-08-05 19:28:08,399][15444] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-08-05 19:28:08,757][15444] Updated weights for policy 0, policy_version 50931 (0.0018) [2024-08-05 19:28:12,467][15444] Updated weights for policy 0, policy_version 50941 (0.0017) [2024-08-05 19:28:13,122][15372] Fps is (10 sec: 23749.4, 60 sec: 24028.6, 300 sec: 24159.2). Total num frames: 417325056. Throughput: 0: 6004.9. Samples: 104331440. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:28:13,122][15372] Avg episode reward: [(0, '41.997')] [2024-08-05 19:28:15,751][15444] Updated weights for policy 0, policy_version 50951 (0.0026) [2024-08-05 19:28:18,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.8, 300 sec: 24159.5). Total num frames: 417447936. Throughput: 0: 6017.4. Samples: 104367720. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:28:18,126][15372] Avg episode reward: [(0, '42.020')] [2024-08-05 19:28:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000050958_417447936.pth... [2024-08-05 19:28:18,242][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000050251_411656192.pth [2024-08-05 19:28:19,031][15444] Updated weights for policy 0, policy_version 50961 (0.0026) [2024-08-05 19:28:22,746][15444] Updated weights for policy 0, policy_version 50971 (0.0018) [2024-08-05 19:28:23,118][15372] Fps is (10 sec: 23764.3, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 417562624. Throughput: 0: 6004.9. Samples: 104385920. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:28:23,119][15372] Avg episode reward: [(0, '42.551')] [2024-08-05 19:28:25,848][15444] Updated weights for policy 0, policy_version 50981 (0.0021) [2024-08-05 19:28:28,122][15372] Fps is (10 sec: 23749.2, 60 sec: 24029.2, 300 sec: 24159.3). Total num frames: 417685504. Throughput: 0: 6008.1. Samples: 104421790. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:28:28,129][15372] Avg episode reward: [(0, '43.271')] [2024-08-05 19:28:29,512][15444] Updated weights for policy 0, policy_version 50991 (0.0017) [2024-08-05 19:28:32,736][15444] Updated weights for policy 0, policy_version 51001 (0.0013) [2024-08-05 19:28:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.1, 300 sec: 24131.7). Total num frames: 417800192. Throughput: 0: 5983.1. Samples: 104456890. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:28:33,119][15372] Avg episode reward: [(0, '42.370')] [2024-08-05 19:28:36,220][15444] Updated weights for policy 0, policy_version 51011 (0.0025) [2024-08-05 19:28:38,118][15372] Fps is (10 sec: 23764.6, 60 sec: 24029.9, 300 sec: 24131.9). Total num frames: 417923072. Throughput: 0: 5995.6. Samples: 104475790. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:28:38,126][15372] Avg episode reward: [(0, '43.093')] [2024-08-05 19:28:39,930][15444] Updated weights for policy 0, policy_version 51021 (0.0019) [2024-08-05 19:28:42,831][15444] Updated weights for policy 0, policy_version 51031 (0.0012) [2024-08-05 19:28:43,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 418054144. Throughput: 0: 5996.0. Samples: 104512030. Policy #0 lag: (min: 1.0, avg: 4.8, max: 7.0) [2024-08-05 19:28:43,119][15372] Avg episode reward: [(0, '44.600')] [2024-08-05 19:28:45,043][15417] Signal inference workers to stop experience collection... (18750 times) [2024-08-05 19:28:45,043][15417] Signal inference workers to resume experience collection... 
(18750 times) [2024-08-05 19:28:45,087][15444] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-08-05 19:28:45,094][15444] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-08-05 19:28:46,446][15444] Updated weights for policy 0, policy_version 51041 (0.0017) [2024-08-05 19:28:48,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 418160640. Throughput: 0: 6014.6. Samples: 104548170. Policy #0 lag: (min: 1.0, avg: 4.8, max: 7.0) [2024-08-05 19:28:48,119][15372] Avg episode reward: [(0, '43.931')] [2024-08-05 19:28:49,492][15444] Updated weights for policy 0, policy_version 51051 (0.0017) [2024-08-05 19:28:53,121][15372] Fps is (10 sec: 22932.1, 60 sec: 23892.4, 300 sec: 24131.5). Total num frames: 418283520. Throughput: 0: 6028.8. Samples: 104567330. Policy #0 lag: (min: 1.0, avg: 4.8, max: 7.0) [2024-08-05 19:28:53,129][15372] Avg episode reward: [(0, '43.058')] [2024-08-05 19:28:53,155][15444] Updated weights for policy 0, policy_version 51061 (0.0017) [2024-08-05 19:28:56,608][15444] Updated weights for policy 0, policy_version 51071 (0.0012) [2024-08-05 19:28:58,119][15372] Fps is (10 sec: 25394.1, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 418414592. Throughput: 0: 6031.5. Samples: 104602840. Policy #0 lag: (min: 1.0, avg: 4.8, max: 7.0) [2024-08-05 19:28:58,127][15372] Avg episode reward: [(0, '42.346')] [2024-08-05 19:28:59,670][15444] Updated weights for policy 0, policy_version 51081 (0.0019) [2024-08-05 19:29:03,119][15372] Fps is (10 sec: 24581.5, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 418529280. Throughput: 0: 6038.9. Samples: 104639470. 
Policy #0 lag: (min: 1.0, avg: 4.8, max: 7.0) [2024-08-05 19:29:03,126][15372] Avg episode reward: [(0, '42.325')] [2024-08-05 19:29:03,334][15444] Updated weights for policy 0, policy_version 51091 (0.0012) [2024-08-05 19:29:06,644][15444] Updated weights for policy 0, policy_version 51101 (0.0013) [2024-08-05 19:29:08,118][15372] Fps is (10 sec: 23758.1, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 418652160. Throughput: 0: 6041.8. Samples: 104657800. Policy #0 lag: (min: 1.0, avg: 4.8, max: 7.0) [2024-08-05 19:29:08,126][15372] Avg episode reward: [(0, '43.435')] [2024-08-05 19:29:09,815][15444] Updated weights for policy 0, policy_version 51111 (0.0011) [2024-08-05 19:29:13,121][15372] Fps is (10 sec: 24570.2, 60 sec: 24166.7, 300 sec: 24131.5). Total num frames: 418775040. Throughput: 0: 6058.6. Samples: 104694420. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:29:13,129][15372] Avg episode reward: [(0, '42.920')] [2024-08-05 19:29:13,443][15444] Updated weights for policy 0, policy_version 51121 (0.0025) [2024-08-05 19:29:16,642][15444] Updated weights for policy 0, policy_version 51131 (0.0020) [2024-08-05 19:29:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 418897920. Throughput: 0: 6074.0. Samples: 104730220. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:29:18,126][15372] Avg episode reward: [(0, '42.411')] [2024-08-05 19:29:19,913][15444] Updated weights for policy 0, policy_version 51141 (0.0014) [2024-08-05 19:29:23,118][15372] Fps is (10 sec: 24582.2, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 419020800. Throughput: 0: 6066.0. Samples: 104748760. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:29:23,126][15372] Avg episode reward: [(0, '42.648')] [2024-08-05 19:29:23,574][15444] Updated weights for policy 0, policy_version 51151 (0.0013) [2024-08-05 19:29:26,822][15444] Updated weights for policy 0, policy_version 51161 (0.0021) [2024-08-05 19:29:28,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24167.7, 300 sec: 24131.7). Total num frames: 419135488. Throughput: 0: 6061.3. Samples: 104784790. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:29:28,119][15372] Avg episode reward: [(0, '41.478')] [2024-08-05 19:29:30,146][15444] Updated weights for policy 0, policy_version 51171 (0.0011) [2024-08-05 19:29:31,076][15417] Signal inference workers to stop experience collection... (18800 times) [2024-08-05 19:29:31,084][15417] Signal inference workers to resume experience collection... (18800 times) [2024-08-05 19:29:31,123][15444] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-08-05 19:29:31,128][15444] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-08-05 19:29:33,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 419258368. Throughput: 0: 6054.5. Samples: 104820620. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:29:33,119][15372] Avg episode reward: [(0, '41.948')] [2024-08-05 19:29:33,829][15444] Updated weights for policy 0, policy_version 51181 (0.0020) [2024-08-05 19:29:37,093][15444] Updated weights for policy 0, policy_version 51191 (0.0020) [2024-08-05 19:29:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 419381248. Throughput: 0: 6043.2. Samples: 104839260. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:29:38,119][15372] Avg episode reward: [(0, '43.334')] [2024-08-05 19:29:40,210][15444] Updated weights for policy 0, policy_version 51201 (0.0013) [2024-08-05 19:29:43,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.7, 300 sec: 24104.1). Total num frames: 419495936. Throughput: 0: 6048.9. Samples: 104875040. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:29:43,132][15372] Avg episode reward: [(0, '43.674')] [2024-08-05 19:29:43,870][15444] Updated weights for policy 0, policy_version 51211 (0.0012) [2024-08-05 19:29:47,464][15444] Updated weights for policy 0, policy_version 51221 (0.0014) [2024-08-05 19:29:48,119][15372] Fps is (10 sec: 22937.3, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 419610624. Throughput: 0: 6026.4. Samples: 104910660. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:29:48,119][15372] Avg episode reward: [(0, '42.362')] [2024-08-05 19:29:50,670][15444] Updated weights for policy 0, policy_version 51231 (0.0013) [2024-08-05 19:29:53,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.8, 300 sec: 24131.7). Total num frames: 419741696. Throughput: 0: 6024.4. Samples: 104928900. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:29:53,126][15372] Avg episode reward: [(0, '41.925')] [2024-08-05 19:29:54,351][15444] Updated weights for policy 0, policy_version 51241 (0.0012) [2024-08-05 19:29:57,568][15444] Updated weights for policy 0, policy_version 51251 (0.0023) [2024-08-05 19:29:58,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24030.0, 300 sec: 24103.9). Total num frames: 419856384. Throughput: 0: 6010.5. Samples: 104964880. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:29:58,119][15372] Avg episode reward: [(0, '43.107')] [2024-08-05 19:30:01,052][15444] Updated weights for policy 0, policy_version 51261 (0.0031) [2024-08-05 19:30:03,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.5, 300 sec: 24131.8). 
Total num frames: 419979264. Throughput: 0: 6016.0. Samples: 105000940. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:30:03,119][15372] Avg episode reward: [(0, '43.187')] [2024-08-05 19:30:04,321][15444] Updated weights for policy 0, policy_version 51271 (0.0015) [2024-08-05 19:30:07,953][15444] Updated weights for policy 0, policy_version 51281 (0.0022) [2024-08-05 19:30:08,119][15372] Fps is (10 sec: 23757.6, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 420093952. Throughput: 0: 5994.9. Samples: 105018530. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:30:08,119][15372] Avg episode reward: [(0, '43.319')] [2024-08-05 19:30:11,194][15444] Updated weights for policy 0, policy_version 51291 (0.0026) [2024-08-05 19:30:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24030.8, 300 sec: 24076.2). Total num frames: 420216832. Throughput: 0: 5988.2. Samples: 105054260. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:30:13,126][15372] Avg episode reward: [(0, '43.317')] [2024-08-05 19:30:14,564][15444] Updated weights for policy 0, policy_version 51301 (0.0012) [2024-08-05 19:30:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 420331520. Throughput: 0: 5996.9. Samples: 105090480. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:30:18,126][15372] Avg episode reward: [(0, '43.689')] [2024-08-05 19:30:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000051310_420331520.pth... 
[2024-08-05 19:30:18,286][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000050605_414556160.pth [2024-08-05 19:30:18,391][15444] Updated weights for policy 0, policy_version 51311 (0.0017) [2024-08-05 19:30:21,397][15444] Updated weights for policy 0, policy_version 51321 (0.0016) [2024-08-05 19:30:23,118][15372] Fps is (10 sec: 23757.1, 60 sec: 23893.3, 300 sec: 24076.3). Total num frames: 420454400. Throughput: 0: 5990.5. Samples: 105108830. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:30:23,126][15372] Avg episode reward: [(0, '43.043')] [2024-08-05 19:30:24,761][15444] Updated weights for policy 0, policy_version 51331 (0.0011) [2024-08-05 19:30:26,722][15417] Signal inference workers to stop experience collection... (18850 times) [2024-08-05 19:30:26,723][15417] Signal inference workers to resume experience collection... (18850 times) [2024-08-05 19:30:26,760][15444] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-08-05 19:30:26,760][15444] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-08-05 19:30:28,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 420577280. Throughput: 0: 6017.2. Samples: 105145810. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:30:28,119][15372] Avg episode reward: [(0, '42.913')] [2024-08-05 19:30:28,244][15444] Updated weights for policy 0, policy_version 51341 (0.0012) [2024-08-05 19:30:31,538][15444] Updated weights for policy 0, policy_version 51351 (0.0019) [2024-08-05 19:30:33,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 420700160. Throughput: 0: 6028.0. Samples: 105181920. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:30:33,119][15372] Avg episode reward: [(0, '43.772')] [2024-08-05 19:30:35,047][15444] Updated weights for policy 0, policy_version 51361 (0.0011) [2024-08-05 19:30:38,021][15444] Updated weights for policy 0, policy_version 51371 (0.0028) [2024-08-05 19:30:38,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 420831232. Throughput: 0: 6036.0. Samples: 105200520. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:30:38,119][15372] Avg episode reward: [(0, '43.194')] [2024-08-05 19:30:41,688][15444] Updated weights for policy 0, policy_version 51381 (0.0011) [2024-08-05 19:30:43,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 420945920. Throughput: 0: 6047.8. Samples: 105237030. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:30:43,119][15372] Avg episode reward: [(0, '43.717')] [2024-08-05 19:30:44,873][15444] Updated weights for policy 0, policy_version 51391 (0.0013) [2024-08-05 19:30:48,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24302.9, 300 sec: 24159.4). Total num frames: 421068800. Throughput: 0: 6060.4. Samples: 105273660. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 19:30:48,128][15372] Avg episode reward: [(0, '42.658')] [2024-08-05 19:30:48,359][15444] Updated weights for policy 0, policy_version 51401 (0.0012) [2024-08-05 19:30:51,714][15444] Updated weights for policy 0, policy_version 51411 (0.0014) [2024-08-05 19:30:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 421191680. Throughput: 0: 6076.2. Samples: 105291960. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:30:53,119][15372] Avg episode reward: [(0, '42.970')] [2024-08-05 19:30:55,027][15444] Updated weights for policy 0, policy_version 51421 (0.0037) [2024-08-05 19:30:58,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24303.1, 300 sec: 24159.5). 
Total num frames: 421314560. Throughput: 0: 6094.9. Samples: 105328530. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:30:58,126][15372] Avg episode reward: [(0, '43.685')] [2024-08-05 19:30:58,574][15444] Updated weights for policy 0, policy_version 51431 (0.0031) [2024-08-05 19:31:01,767][15444] Updated weights for policy 0, policy_version 51441 (0.0014) [2024-08-05 19:31:03,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24166.2, 300 sec: 24131.7). Total num frames: 421429248. Throughput: 0: 6081.5. Samples: 105364150. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:31:03,119][15372] Avg episode reward: [(0, '44.623')] [2024-08-05 19:31:05,202][15444] Updated weights for policy 0, policy_version 51451 (0.0017) [2024-08-05 19:31:08,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 421552128. Throughput: 0: 6070.4. Samples: 105382000. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:31:08,143][15372] Avg episode reward: [(0, '43.579')] [2024-08-05 19:31:08,763][15444] Updated weights for policy 0, policy_version 51461 (0.0015) [2024-08-05 19:31:12,137][15444] Updated weights for policy 0, policy_version 51471 (0.0022) [2024-08-05 19:31:13,121][15372] Fps is (10 sec: 24571.0, 60 sec: 24301.9, 300 sec: 24131.5). Total num frames: 421675008. Throughput: 0: 6046.8. Samples: 105417930. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:31:13,121][15372] Avg episode reward: [(0, '42.640')] [2024-08-05 19:31:15,607][15444] Updated weights for policy 0, policy_version 51481 (0.0023) [2024-08-05 19:31:16,878][15417] Signal inference workers to stop experience collection... (18900 times) [2024-08-05 19:31:16,891][15417] Signal inference workers to resume experience collection... 
(18900 times) [2024-08-05 19:31:16,929][15444] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-08-05 19:31:16,929][15444] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-08-05 19:31:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 421789696. Throughput: 0: 6045.6. Samples: 105453970. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:31:18,119][15372] Avg episode reward: [(0, '43.112')] [2024-08-05 19:31:19,037][15444] Updated weights for policy 0, policy_version 51491 (0.0011) [2024-08-05 19:31:22,487][15444] Updated weights for policy 0, policy_version 51501 (0.0019) [2024-08-05 19:31:23,118][15372] Fps is (10 sec: 22943.4, 60 sec: 24166.4, 300 sec: 24076.2). Total num frames: 421904384. Throughput: 0: 6049.1. Samples: 105472730. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:31:23,119][15372] Avg episode reward: [(0, '42.602')] [2024-08-05 19:31:25,541][15444] Updated weights for policy 0, policy_version 51511 (0.0028) [2024-08-05 19:31:28,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 422035456. Throughput: 0: 6025.3. Samples: 105508170. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:31:28,126][15372] Avg episode reward: [(0, '42.498')] [2024-08-05 19:31:29,346][15444] Updated weights for policy 0, policy_version 51521 (0.0024) [2024-08-05 19:31:32,495][15444] Updated weights for policy 0, policy_version 51531 (0.0014) [2024-08-05 19:31:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 422150144. Throughput: 0: 5999.8. Samples: 105543650. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:31:33,126][15372] Avg episode reward: [(0, '43.251')] [2024-08-05 19:31:35,795][15444] Updated weights for policy 0, policy_version 51541 (0.0013) [2024-08-05 19:31:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24103.9). 
Total num frames: 422273024. Throughput: 0: 6004.2. Samples: 105562150. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:31:38,126][15372] Avg episode reward: [(0, '42.398')] [2024-08-05 19:31:39,423][15444] Updated weights for policy 0, policy_version 51551 (0.0014) [2024-08-05 19:31:42,797][15444] Updated weights for policy 0, policy_version 51561 (0.0012) [2024-08-05 19:31:43,124][15372] Fps is (10 sec: 23743.8, 60 sec: 24027.7, 300 sec: 24075.7). Total num frames: 422387712. Throughput: 0: 5994.4. Samples: 105598310. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:31:43,124][15372] Avg episode reward: [(0, '42.612')] [2024-08-05 19:31:46,131][15444] Updated weights for policy 0, policy_version 51571 (0.0011) [2024-08-05 19:31:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 422510592. Throughput: 0: 5994.9. Samples: 105633920. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:31:48,126][15372] Avg episode reward: [(0, '42.876')] [2024-08-05 19:31:49,644][15444] Updated weights for policy 0, policy_version 51581 (0.0012) [2024-08-05 19:31:53,002][15444] Updated weights for policy 0, policy_version 51591 (0.0024) [2024-08-05 19:31:53,118][15372] Fps is (10 sec: 24589.4, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 422633472. Throughput: 0: 6010.0. Samples: 105652450. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:31:53,119][15372] Avg episode reward: [(0, '42.700')] [2024-08-05 19:31:56,479][15444] Updated weights for policy 0, policy_version 51601 (0.0012) [2024-08-05 19:31:58,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 422748160. Throughput: 0: 6015.6. Samples: 105688620. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 19:31:58,126][15372] Avg episode reward: [(0, '44.680')] [2024-08-05 19:31:59,736][15444] Updated weights for policy 0, policy_version 51611 (0.0012) [2024-08-05 19:32:03,091][15444] Updated weights for policy 0, policy_version 51621 (0.0010) [2024-08-05 19:32:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.6, 300 sec: 24131.7). Total num frames: 422879232. Throughput: 0: 6024.2. Samples: 105725060. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 19:32:03,119][15372] Avg episode reward: [(0, '44.147')] [2024-08-05 19:32:06,504][15444] Updated weights for policy 0, policy_version 51631 (0.0012) [2024-08-05 19:32:08,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 422993920. Throughput: 0: 6018.2. Samples: 105743550. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 19:32:08,126][15372] Avg episode reward: [(0, '42.913')] [2024-08-05 19:32:09,830][15444] Updated weights for policy 0, policy_version 51641 (0.0020) [2024-08-05 19:32:10,398][15417] Signal inference workers to stop experience collection... (18950 times) [2024-08-05 19:32:10,401][15417] Signal inference workers to resume experience collection... (18950 times) [2024-08-05 19:32:10,471][15444] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-08-05 19:32:10,473][15444] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-08-05 19:32:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.9, 300 sec: 24131.8). Total num frames: 423116800. Throughput: 0: 6036.7. Samples: 105779820. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 19:32:13,119][15372] Avg episode reward: [(0, '42.998')] [2024-08-05 19:32:13,290][15444] Updated weights for policy 0, policy_version 51651 (0.0011) [2024-08-05 19:32:16,728][15444] Updated weights for policy 0, policy_version 51661 (0.0016) [2024-08-05 19:32:18,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 423239680. Throughput: 0: 6046.2. Samples: 105815730. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 19:32:18,119][15372] Avg episode reward: [(0, '43.450')] [2024-08-05 19:32:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000051665_423239680.pth... [2024-08-05 19:32:18,291][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000050958_417447936.pth [2024-08-05 19:32:20,013][15444] Updated weights for policy 0, policy_version 51671 (0.0012) [2024-08-05 19:32:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24131.8). Total num frames: 423362560. Throughput: 0: 6042.2. Samples: 105834050. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 19:32:23,126][15372] Avg episode reward: [(0, '43.469')] [2024-08-05 19:32:23,444][15444] Updated weights for policy 0, policy_version 51681 (0.0013) [2024-08-05 19:32:26,999][15444] Updated weights for policy 0, policy_version 51691 (0.0035) [2024-08-05 19:32:28,119][15372] Fps is (10 sec: 22937.7, 60 sec: 23893.3, 300 sec: 24104.0). Total num frames: 423469056. Throughput: 0: 6031.8. Samples: 105869710. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 19:32:28,119][15372] Avg episode reward: [(0, '43.908')] [2024-08-05 19:32:30,571][15444] Updated weights for policy 0, policy_version 51701 (0.0013) [2024-08-05 19:32:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 423600128. 
Throughput: 0: 6038.7. Samples: 105905660. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 19:32:33,119][15372] Avg episode reward: [(0, '43.926')] [2024-08-05 19:32:33,685][15444] Updated weights for policy 0, policy_version 51711 (0.0014) [2024-08-05 19:32:37,303][15444] Updated weights for policy 0, policy_version 51721 (0.0026) [2024-08-05 19:32:38,119][15372] Fps is (10 sec: 25395.3, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 423723008. Throughput: 0: 6026.4. Samples: 105923640. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 19:32:38,119][15372] Avg episode reward: [(0, '43.550')] [2024-08-05 19:32:40,553][15444] Updated weights for policy 0, policy_version 51731 (0.0021) [2024-08-05 19:32:43,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24032.0, 300 sec: 24103.9). Total num frames: 423829504. Throughput: 0: 6014.7. Samples: 105959280. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 19:32:43,126][15372] Avg episode reward: [(0, '42.876')] [2024-08-05 19:32:44,052][15444] Updated weights for policy 0, policy_version 51741 (0.0014) [2024-08-05 19:32:47,688][15444] Updated weights for policy 0, policy_version 51751 (0.0017) [2024-08-05 19:32:47,802][15417] Signal inference workers to stop experience collection... (19000 times) [2024-08-05 19:32:47,803][15417] Signal inference workers to resume experience collection... (19000 times) [2024-08-05 19:32:47,841][15444] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-08-05 19:32:47,842][15444] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-08-05 19:32:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 423960576. Throughput: 0: 6007.5. Samples: 105995400. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 19:32:48,119][15372] Avg episode reward: [(0, '41.478')] [2024-08-05 19:32:50,766][15444] Updated weights for policy 0, policy_version 51761 (0.0022) [2024-08-05 19:32:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.8, 300 sec: 24076.2). Total num frames: 424075264. Throughput: 0: 5997.5. Samples: 106013440. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 19:32:53,126][15372] Avg episode reward: [(0, '43.196')] [2024-08-05 19:32:54,567][15444] Updated weights for policy 0, policy_version 51771 (0.0022) [2024-08-05 19:32:57,787][15444] Updated weights for policy 0, policy_version 51781 (0.0030) [2024-08-05 19:32:58,118][15372] Fps is (10 sec: 22937.8, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 424189952. Throughput: 0: 5983.1. Samples: 106049060. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:32:58,119][15372] Avg episode reward: [(0, '42.983')] [2024-08-05 19:33:01,139][15444] Updated weights for policy 0, policy_version 51791 (0.0029) [2024-08-05 19:33:03,120][15372] Fps is (10 sec: 23752.9, 60 sec: 23892.7, 300 sec: 24103.8). Total num frames: 424312832. Throughput: 0: 5968.3. Samples: 106084310. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:03,128][15372] Avg episode reward: [(0, '42.766')] [2024-08-05 19:33:04,599][15444] Updated weights for policy 0, policy_version 51801 (0.0010) [2024-08-05 19:33:07,785][15444] Updated weights for policy 0, policy_version 51811 (0.0015) [2024-08-05 19:33:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24104.2). Total num frames: 424435712. Throughput: 0: 5977.8. Samples: 106103050. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:08,119][15372] Avg episode reward: [(0, '43.922')] [2024-08-05 19:33:11,495][15444] Updated weights for policy 0, policy_version 51821 (0.0024) [2024-08-05 19:33:13,119][15372] Fps is (10 sec: 23760.5, 60 sec: 23893.3, 300 sec: 24076.1). 
Total num frames: 424550400. Throughput: 0: 5985.1. Samples: 106139040. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:13,119][15372] Avg episode reward: [(0, '44.100')] [2024-08-05 19:33:14,708][15444] Updated weights for policy 0, policy_version 51831 (0.0024) [2024-08-05 19:33:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.4, 300 sec: 24103.9). Total num frames: 424673280. Throughput: 0: 6000.4. Samples: 106175680. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:18,126][15372] Avg episode reward: [(0, '43.745')] [2024-08-05 19:33:18,177][15444] Updated weights for policy 0, policy_version 51841 (0.0026) [2024-08-05 19:33:21,473][15444] Updated weights for policy 0, policy_version 51851 (0.0021) [2024-08-05 19:33:23,118][15372] Fps is (10 sec: 24576.3, 60 sec: 23893.3, 300 sec: 24104.2). Total num frames: 424796160. Throughput: 0: 6002.5. Samples: 106193750. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:23,126][15372] Avg episode reward: [(0, '42.764')] [2024-08-05 19:33:24,773][15444] Updated weights for policy 0, policy_version 51861 (0.0018) [2024-08-05 19:33:28,121][15372] Fps is (10 sec: 24570.8, 60 sec: 24165.6, 300 sec: 24131.5). Total num frames: 424919040. Throughput: 0: 6030.2. Samples: 106230650. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:28,128][15372] Avg episode reward: [(0, '41.917')] [2024-08-05 19:33:28,356][15444] Updated weights for policy 0, policy_version 51871 (0.0018) [2024-08-05 19:33:28,471][15417] Signal inference workers to stop experience collection... (19050 times) [2024-08-05 19:33:28,472][15417] Signal inference workers to resume experience collection... 
(19050 times) [2024-08-05 19:33:28,513][15444] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-08-05 19:33:28,514][15444] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-08-05 19:33:31,612][15444] Updated weights for policy 0, policy_version 51881 (0.0024) [2024-08-05 19:33:33,123][15372] Fps is (10 sec: 24565.7, 60 sec: 24028.2, 300 sec: 24131.3). Total num frames: 425041920. Throughput: 0: 6021.5. Samples: 106266390. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:33,131][15372] Avg episode reward: [(0, '43.248')] [2024-08-05 19:33:35,010][15444] Updated weights for policy 0, policy_version 51891 (0.0012) [2024-08-05 19:33:38,118][15372] Fps is (10 sec: 24581.3, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 425164800. Throughput: 0: 6025.8. Samples: 106284600. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:38,126][15372] Avg episode reward: [(0, '44.156')] [2024-08-05 19:33:38,462][15444] Updated weights for policy 0, policy_version 51901 (0.0022) [2024-08-05 19:33:42,067][15444] Updated weights for policy 0, policy_version 51911 (0.0013) [2024-08-05 19:33:43,118][15372] Fps is (10 sec: 23766.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 425279488. Throughput: 0: 6033.8. Samples: 106320580. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:43,119][15372] Avg episode reward: [(0, '44.123')] [2024-08-05 19:33:45,404][15444] Updated weights for policy 0, policy_version 51921 (0.0022) [2024-08-05 19:33:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.9). Total num frames: 425402368. Throughput: 0: 6057.1. Samples: 106356870. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:48,126][15372] Avg episode reward: [(0, '42.683')] [2024-08-05 19:33:48,659][15444] Updated weights for policy 0, policy_version 51931 (0.0013) [2024-08-05 19:33:52,097][15444] Updated weights for policy 0, policy_version 51941 (0.0018) [2024-08-05 19:33:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24104.0). Total num frames: 425525248. Throughput: 0: 6036.2. Samples: 106374680. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:53,119][15372] Avg episode reward: [(0, '42.776')] [2024-08-05 19:33:55,393][15444] Updated weights for policy 0, policy_version 51951 (0.0033) [2024-08-05 19:33:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 425639936. Throughput: 0: 6037.6. Samples: 106410730. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:33:58,125][15372] Avg episode reward: [(0, '42.876')] [2024-08-05 19:33:59,016][15444] Updated weights for policy 0, policy_version 51961 (0.0016) [2024-08-05 19:34:02,567][15444] Updated weights for policy 0, policy_version 51971 (0.0014) [2024-08-05 19:34:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24167.1, 300 sec: 24103.9). Total num frames: 425762816. Throughput: 0: 6018.9. Samples: 106446530. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:34:03,119][15372] Avg episode reward: [(0, '43.122')] [2024-08-05 19:34:05,686][15444] Updated weights for policy 0, policy_version 51981 (0.0025) [2024-08-05 19:34:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24104.1). Total num frames: 425885696. Throughput: 0: 6020.0. Samples: 106464650. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:34:08,126][15372] Avg episode reward: [(0, '42.303')] [2024-08-05 19:34:09,174][15444] Updated weights for policy 0, policy_version 51991 (0.0010) [2024-08-05 19:34:12,798][15444] Updated weights for policy 0, policy_version 52001 (0.0022) [2024-08-05 19:34:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24076.1). Total num frames: 426000384. Throughput: 0: 5998.3. Samples: 106500560. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:34:13,119][15372] Avg episode reward: [(0, '42.500')] [2024-08-05 19:34:15,890][15444] Updated weights for policy 0, policy_version 52011 (0.0026) [2024-08-05 19:34:18,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24076.1). Total num frames: 426123264. Throughput: 0: 6009.4. Samples: 106536790. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:34:18,126][15372] Avg episode reward: [(0, '43.476')] [2024-08-05 19:34:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000052017_426123264.pth... [2024-08-05 19:34:18,284][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000051310_420331520.pth [2024-08-05 19:34:19,542][15444] Updated weights for policy 0, policy_version 52021 (0.0012) [2024-08-05 19:34:22,182][15417] Signal inference workers to stop experience collection... (19100 times) [2024-08-05 19:34:22,183][15417] Signal inference workers to resume experience collection... 
(19100 times) [2024-08-05 19:34:22,215][15444] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-08-05 19:34:22,223][15444] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-08-05 19:34:22,836][15444] Updated weights for policy 0, policy_version 52031 (0.0011) [2024-08-05 19:34:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 426237952. Throughput: 0: 5996.9. Samples: 106554460. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:34:23,119][15372] Avg episode reward: [(0, '43.610')] [2024-08-05 19:34:26,135][15444] Updated weights for policy 0, policy_version 52041 (0.0011) [2024-08-05 19:34:28,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24030.6, 300 sec: 24076.1). Total num frames: 426360832. Throughput: 0: 6006.4. Samples: 106590870. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:34:28,127][15372] Avg episode reward: [(0, '43.952')] [2024-08-05 19:34:29,774][15444] Updated weights for policy 0, policy_version 52051 (0.0014) [2024-08-05 19:34:32,942][15444] Updated weights for policy 0, policy_version 52061 (0.0011) [2024-08-05 19:34:33,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24031.5, 300 sec: 24076.1). Total num frames: 426483712. Throughput: 0: 6003.8. Samples: 106627040. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:34:33,119][15372] Avg episode reward: [(0, '42.394')] [2024-08-05 19:34:36,389][15444] Updated weights for policy 0, policy_version 52071 (0.0024) [2024-08-05 19:34:38,118][15372] Fps is (10 sec: 23757.6, 60 sec: 23893.3, 300 sec: 24076.2). Total num frames: 426598400. Throughput: 0: 6014.7. Samples: 106645340. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:34:38,126][15372] Avg episode reward: [(0, '42.427')] [2024-08-05 19:34:39,633][15444] Updated weights for policy 0, policy_version 52081 (0.0014) [2024-08-05 19:34:43,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24103.9). 
Total num frames: 426721280. Throughput: 0: 6028.2. Samples: 106682000. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:34:43,126][15372] Avg episode reward: [(0, '43.317')] [2024-08-05 19:34:43,241][15444] Updated weights for policy 0, policy_version 52091 (0.0034) [2024-08-05 19:34:46,481][15444] Updated weights for policy 0, policy_version 52101 (0.0017) [2024-08-05 19:34:48,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 426852352. Throughput: 0: 6027.5. Samples: 106717770. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:34:48,126][15372] Avg episode reward: [(0, '43.466')] [2024-08-05 19:34:49,855][15444] Updated weights for policy 0, policy_version 52111 (0.0019) [2024-08-05 19:34:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 426967040. Throughput: 0: 6034.9. Samples: 106736220. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:34:53,126][15372] Avg episode reward: [(0, '44.162')] [2024-08-05 19:34:53,202][15444] Updated weights for policy 0, policy_version 52121 (0.0017) [2024-08-05 19:34:56,555][15444] Updated weights for policy 0, policy_version 52131 (0.0017) [2024-08-05 19:34:58,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 427089920. Throughput: 0: 6043.1. Samples: 106772500. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:34:58,126][15372] Avg episode reward: [(0, '43.620')] [2024-08-05 19:34:59,877][15444] Updated weights for policy 0, policy_version 52141 (0.0012) [2024-08-05 19:35:03,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 427212800. Throughput: 0: 6052.0. Samples: 106809130. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 19:35:03,126][15372] Avg episode reward: [(0, '43.161')] [2024-08-05 19:35:03,320][15444] Updated weights for policy 0, policy_version 52151 (0.0017) [2024-08-05 19:35:06,674][15444] Updated weights for policy 0, policy_version 52161 (0.0012) [2024-08-05 19:35:08,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 427335680. Throughput: 0: 6074.4. Samples: 106827810. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:35:08,119][15372] Avg episode reward: [(0, '43.315')] [2024-08-05 19:35:10,124][15444] Updated weights for policy 0, policy_version 52171 (0.0013) [2024-08-05 19:35:13,120][15372] Fps is (10 sec: 24571.6, 60 sec: 24302.1, 300 sec: 24159.3). Total num frames: 427458560. Throughput: 0: 6082.7. Samples: 106864600. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:35:13,128][15372] Avg episode reward: [(0, '43.193')] [2024-08-05 19:35:13,425][15444] Updated weights for policy 0, policy_version 52181 (0.0020) [2024-08-05 19:35:16,937][15444] Updated weights for policy 0, policy_version 52191 (0.0020) [2024-08-05 19:35:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 427581440. Throughput: 0: 6070.5. Samples: 106900210. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:35:18,119][15372] Avg episode reward: [(0, '43.970')] [2024-08-05 19:35:20,027][15444] Updated weights for policy 0, policy_version 52201 (0.0017) [2024-08-05 19:35:23,146][15372] Fps is (10 sec: 23697.3, 60 sec: 24291.9, 300 sec: 24129.5). Total num frames: 427696128. Throughput: 0: 6081.7. Samples: 106919180. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:35:23,154][15372] Avg episode reward: [(0, '42.596')] [2024-08-05 19:35:23,649][15444] Updated weights for policy 0, policy_version 52211 (0.0013) [2024-08-05 19:35:24,615][15417] Signal inference workers to stop experience collection... 
(19150 times) [2024-08-05 19:35:24,615][15417] Signal inference workers to resume experience collection... (19150 times) [2024-08-05 19:35:24,662][15444] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-08-05 19:35:24,663][15444] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-08-05 19:35:26,777][15444] Updated weights for policy 0, policy_version 52221 (0.0013) [2024-08-05 19:35:28,121][15372] Fps is (10 sec: 23751.6, 60 sec: 24302.2, 300 sec: 24131.5). Total num frames: 427819008. Throughput: 0: 6071.7. Samples: 106955240. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:35:28,129][15372] Avg episode reward: [(0, '42.475')] [2024-08-05 19:35:30,316][15444] Updated weights for policy 0, policy_version 52231 (0.0021) [2024-08-05 19:35:33,119][15372] Fps is (10 sec: 24641.9, 60 sec: 24302.8, 300 sec: 24103.9). Total num frames: 427941888. Throughput: 0: 6087.1. Samples: 106991690. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:35:33,120][15372] Avg episode reward: [(0, '42.483')] [2024-08-05 19:35:33,929][15444] Updated weights for policy 0, policy_version 52241 (0.0013) [2024-08-05 19:35:37,048][15444] Updated weights for policy 0, policy_version 52251 (0.0031) [2024-08-05 19:35:38,118][15372] Fps is (10 sec: 23761.8, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 428056576. Throughput: 0: 6075.1. Samples: 107009600. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:35:38,119][15372] Avg episode reward: [(0, '41.500')] [2024-08-05 19:35:40,735][15444] Updated weights for policy 0, policy_version 52261 (0.0037) [2024-08-05 19:35:43,119][15372] Fps is (10 sec: 23757.5, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 428179456. Throughput: 0: 6048.5. Samples: 107044680. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:35:43,126][15372] Avg episode reward: [(0, '42.413')] [2024-08-05 19:35:43,801][15444] Updated weights for policy 0, policy_version 52271 (0.0010) [2024-08-05 19:35:47,569][15444] Updated weights for policy 0, policy_version 52281 (0.0015) [2024-08-05 19:35:48,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 428302336. Throughput: 0: 6045.6. Samples: 107081180. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:35:48,119][15372] Avg episode reward: [(0, '43.087')] [2024-08-05 19:35:50,954][15444] Updated weights for policy 0, policy_version 52291 (0.0011) [2024-08-05 19:35:53,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24076.1). Total num frames: 428417024. Throughput: 0: 6036.4. Samples: 107099450. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:35:53,126][15372] Avg episode reward: [(0, '43.258')] [2024-08-05 19:35:54,262][15444] Updated weights for policy 0, policy_version 52301 (0.0017) [2024-08-05 19:35:57,792][15444] Updated weights for policy 0, policy_version 52311 (0.0034) [2024-08-05 19:35:58,118][15372] Fps is (10 sec: 22937.9, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 428531712. Throughput: 0: 6012.3. Samples: 107135140. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:35:58,119][15372] Avg episode reward: [(0, '42.334')] [2024-08-05 19:36:00,783][15417] Signal inference workers to stop experience collection... (19200 times) [2024-08-05 19:36:00,784][15417] Signal inference workers to resume experience collection... 
(19200 times) [2024-08-05 19:36:00,865][15444] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-08-05 19:36:00,865][15444] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-08-05 19:36:00,867][15444] Updated weights for policy 0, policy_version 52321 (0.0013) [2024-08-05 19:36:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 428662784. Throughput: 0: 6011.8. Samples: 107170740. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:36:03,126][15372] Avg episode reward: [(0, '42.146')] [2024-08-05 19:36:04,551][15444] Updated weights for policy 0, policy_version 52331 (0.0021) [2024-08-05 19:36:08,069][15444] Updated weights for policy 0, policy_version 52341 (0.0012) [2024-08-05 19:36:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24076.3). Total num frames: 428777472. Throughput: 0: 6011.4. Samples: 107189530. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:36:08,119][15372] Avg episode reward: [(0, '42.809')] [2024-08-05 19:36:11,178][15444] Updated weights for policy 0, policy_version 52351 (0.0012) [2024-08-05 19:36:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.7, 300 sec: 24103.9). Total num frames: 428900352. Throughput: 0: 5999.6. Samples: 107225210. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:36:13,126][15372] Avg episode reward: [(0, '43.291')] [2024-08-05 19:36:14,702][15444] Updated weights for policy 0, policy_version 52361 (0.0022) [2024-08-05 19:36:17,886][15444] Updated weights for policy 0, policy_version 52371 (0.0012) [2024-08-05 19:36:18,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 429023232. Throughput: 0: 5993.1. Samples: 107261380. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:36:18,119][15372] Avg episode reward: [(0, '44.759')] [2024-08-05 19:36:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000052371_429023232.pth... [2024-08-05 19:36:18,264][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000051665_423239680.pth [2024-08-05 19:36:18,268][15417] Saving new best policy, reward=44.759! [2024-08-05 19:36:21,517][15444] Updated weights for policy 0, policy_version 52381 (0.0027) [2024-08-05 19:36:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24040.8, 300 sec: 24076.1). Total num frames: 429137920. Throughput: 0: 6004.7. Samples: 107279810. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:36:23,119][15372] Avg episode reward: [(0, '44.935')] [2024-08-05 19:36:23,120][15417] Saving new best policy, reward=44.935! [2024-08-05 19:36:24,902][15444] Updated weights for policy 0, policy_version 52391 (0.0021) [2024-08-05 19:36:28,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24030.7, 300 sec: 24103.9). Total num frames: 429260800. Throughput: 0: 6022.5. Samples: 107315690. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:36:28,134][15372] Avg episode reward: [(0, '43.662')] [2024-08-05 19:36:28,215][15444] Updated weights for policy 0, policy_version 52401 (0.0022) [2024-08-05 19:36:31,936][15444] Updated weights for policy 0, policy_version 52411 (0.0012) [2024-08-05 19:36:33,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24030.0, 300 sec: 24103.9). Total num frames: 429383680. Throughput: 0: 6001.3. Samples: 107351240. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:36:33,119][15372] Avg episode reward: [(0, '43.885')] [2024-08-05 19:36:34,891][15444] Updated weights for policy 0, policy_version 52421 (0.0017) [2024-08-05 19:36:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24104.4). Total num frames: 429498368. Throughput: 0: 6024.2. Samples: 107370540. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:36:38,126][15372] Avg episode reward: [(0, '44.883')] [2024-08-05 19:36:38,521][15444] Updated weights for policy 0, policy_version 52431 (0.0013) [2024-08-05 19:36:41,828][15444] Updated weights for policy 0, policy_version 52441 (0.0017) [2024-08-05 19:36:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 429621248. Throughput: 0: 6018.7. Samples: 107405980. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:36:43,119][15372] Avg episode reward: [(0, '43.613')] [2024-08-05 19:36:45,142][15444] Updated weights for policy 0, policy_version 52451 (0.0021) [2024-08-05 19:36:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 429744128. Throughput: 0: 6024.2. Samples: 107441830. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:36:48,126][15372] Avg episode reward: [(0, '42.399')] [2024-08-05 19:36:48,874][15444] Updated weights for policy 0, policy_version 52461 (0.0034) [2024-08-05 19:36:51,910][15444] Updated weights for policy 0, policy_version 52471 (0.0013) [2024-08-05 19:36:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 429858816. Throughput: 0: 6025.6. Samples: 107460680. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:36:53,119][15372] Avg episode reward: [(0, '43.235')] [2024-08-05 19:36:54,021][15417] Signal inference workers to stop experience collection... (19250 times) [2024-08-05 19:36:54,022][15417] Signal inference workers to resume experience collection... 
(19250 times) [2024-08-05 19:36:54,054][15444] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-08-05 19:36:54,083][15444] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-08-05 19:36:55,597][15444] Updated weights for policy 0, policy_version 52481 (0.0014) [2024-08-05 19:36:58,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 429989888. Throughput: 0: 6041.3. Samples: 107497070. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:36:58,119][15372] Avg episode reward: [(0, '43.026')] [2024-08-05 19:36:58,926][15444] Updated weights for policy 0, policy_version 52491 (0.0013) [2024-08-05 19:37:02,275][15444] Updated weights for policy 0, policy_version 52501 (0.0031) [2024-08-05 19:37:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 430104576. Throughput: 0: 6022.0. Samples: 107532370. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:37:03,119][15372] Avg episode reward: [(0, '43.050')] [2024-08-05 19:37:05,678][15444] Updated weights for policy 0, policy_version 52511 (0.0017) [2024-08-05 19:37:08,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24166.2, 300 sec: 24103.9). Total num frames: 430227456. Throughput: 0: 6036.1. Samples: 107551440. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:37:08,119][15372] Avg episode reward: [(0, '43.747')] [2024-08-05 19:37:08,851][15444] Updated weights for policy 0, policy_version 52521 (0.0028) [2024-08-05 19:37:12,457][15444] Updated weights for policy 0, policy_version 52531 (0.0023) [2024-08-05 19:37:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 430342144. Throughput: 0: 6031.1. Samples: 107587090. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:37:13,119][15372] Avg episode reward: [(0, '42.424')] [2024-08-05 19:37:15,613][15444] Updated weights for policy 0, policy_version 52541 (0.0015) [2024-08-05 19:37:18,118][15372] Fps is (10 sec: 23758.1, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 430465024. Throughput: 0: 6035.1. Samples: 107622820. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:37:18,126][15372] Avg episode reward: [(0, '42.204')] [2024-08-05 19:37:19,383][15444] Updated weights for policy 0, policy_version 52551 (0.0022) [2024-08-05 19:37:22,906][15444] Updated weights for policy 0, policy_version 52561 (0.0019) [2024-08-05 19:37:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 430587904. Throughput: 0: 6014.2. Samples: 107641180. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 19:37:23,119][15372] Avg episode reward: [(0, '43.596')] [2024-08-05 19:37:26,128][15444] Updated weights for policy 0, policy_version 52571 (0.0020) [2024-08-05 19:37:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 430702592. Throughput: 0: 5998.7. Samples: 107675920. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 19:37:28,126][15372] Avg episode reward: [(0, '43.768')] [2024-08-05 19:37:29,677][15444] Updated weights for policy 0, policy_version 52581 (0.0038) [2024-08-05 19:37:33,039][15444] Updated weights for policy 0, policy_version 52591 (0.0028) [2024-08-05 19:37:33,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.8, 300 sec: 24076.1). Total num frames: 430825472. Throughput: 0: 6008.6. Samples: 107712220. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 19:37:33,119][15372] Avg episode reward: [(0, '44.286')] [2024-08-05 19:37:36,451][15444] Updated weights for policy 0, policy_version 52601 (0.0014) [2024-08-05 19:37:36,560][15417] Signal inference workers to stop experience collection... 
(19300 times) [2024-08-05 19:37:36,563][15417] Signal inference workers to resume experience collection... (19300 times) [2024-08-05 19:37:36,633][15444] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-08-05 19:37:36,636][15444] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-08-05 19:37:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 430948352. Throughput: 0: 5999.8. Samples: 107730670. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 19:37:38,119][15372] Avg episode reward: [(0, '44.111')] [2024-08-05 19:37:39,737][15444] Updated weights for policy 0, policy_version 52611 (0.0028) [2024-08-05 19:37:43,019][15444] Updated weights for policy 0, policy_version 52621 (0.0026) [2024-08-05 19:37:43,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 431071232. Throughput: 0: 6015.6. Samples: 107767770. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 19:37:43,119][15372] Avg episode reward: [(0, '42.681')] [2024-08-05 19:37:46,501][15444] Updated weights for policy 0, policy_version 52631 (0.0025) [2024-08-05 19:37:48,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 431194112. Throughput: 0: 6031.3. Samples: 107803780. Policy #0 lag: (min: 0.0, avg: 3.9, max: 9.0) [2024-08-05 19:37:48,126][15372] Avg episode reward: [(0, '43.140')] [2024-08-05 19:37:49,910][15444] Updated weights for policy 0, policy_version 52641 (0.0022) [2024-08-05 19:37:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 431308800. Throughput: 0: 6017.4. Samples: 107822220. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:37:53,126][15372] Avg episode reward: [(0, '43.071')] [2024-08-05 19:37:53,346][15444] Updated weights for policy 0, policy_version 52651 (0.0016) [2024-08-05 19:37:56,758][15444] Updated weights for policy 0, policy_version 52661 (0.0013) [2024-08-05 19:37:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.8). Total num frames: 431431680. Throughput: 0: 6025.1. Samples: 107858220. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:37:58,119][15372] Avg episode reward: [(0, '42.934')] [2024-08-05 19:37:59,738][15444] Updated weights for policy 0, policy_version 52671 (0.0016) [2024-08-05 19:38:03,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 431554560. Throughput: 0: 6045.1. Samples: 107894850. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:38:03,126][15372] Avg episode reward: [(0, '42.621')] [2024-08-05 19:38:03,462][15444] Updated weights for policy 0, policy_version 52681 (0.0023) [2024-08-05 19:38:06,874][15444] Updated weights for policy 0, policy_version 52691 (0.0014) [2024-08-05 19:38:08,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.6, 300 sec: 24159.5). Total num frames: 431677440. Throughput: 0: 6036.4. Samples: 107912820. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:38:08,119][15372] Avg episode reward: [(0, '42.554')] [2024-08-05 19:38:09,985][15444] Updated weights for policy 0, policy_version 52701 (0.0014) [2024-08-05 19:38:13,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 431800320. Throughput: 0: 6083.6. Samples: 107949680. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:38:13,126][15372] Avg episode reward: [(0, '42.263')] [2024-08-05 19:38:13,550][15444] Updated weights for policy 0, policy_version 52711 (0.0013) [2024-08-05 19:38:16,957][15444] Updated weights for policy 0, policy_version 52721 (0.0032) [2024-08-05 19:38:18,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 431915008. Throughput: 0: 6076.7. Samples: 107985670. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:38:18,119][15372] Avg episode reward: [(0, '43.988')] [2024-08-05 19:38:18,131][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000052725_431923200.pth... [2024-08-05 19:38:18,291][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000052017_426123264.pth [2024-08-05 19:38:20,210][15444] Updated weights for policy 0, policy_version 52731 (0.0026) [2024-08-05 19:38:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.9). Total num frames: 432037888. Throughput: 0: 6074.5. Samples: 108004020. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:38:23,126][15372] Avg episode reward: [(0, '44.042')] [2024-08-05 19:38:23,525][15444] Updated weights for policy 0, policy_version 52741 (0.0013) [2024-08-05 19:38:25,710][15417] Signal inference workers to stop experience collection... (19350 times) [2024-08-05 19:38:25,710][15417] Signal inference workers to resume experience collection... 
(19350 times) [2024-08-05 19:38:25,779][15444] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-08-05 19:38:25,779][15444] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-08-05 19:38:26,952][15444] Updated weights for policy 0, policy_version 52751 (0.0022) [2024-08-05 19:38:28,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24132.0). Total num frames: 432160768. Throughput: 0: 6057.1. Samples: 108040340. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:38:28,119][15372] Avg episode reward: [(0, '42.518')] [2024-08-05 19:38:30,571][15444] Updated weights for policy 0, policy_version 52761 (0.0018) [2024-08-05 19:38:33,120][15372] Fps is (10 sec: 24572.3, 60 sec: 24302.4, 300 sec: 24131.6). Total num frames: 432283648. Throughput: 0: 6074.2. Samples: 108077130. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:38:33,128][15372] Avg episode reward: [(0, '42.447')] [2024-08-05 19:38:33,888][15444] Updated weights for policy 0, policy_version 52771 (0.0035) [2024-08-05 19:38:37,287][15444] Updated weights for policy 0, policy_version 52781 (0.0025) [2024-08-05 19:38:38,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 432398336. Throughput: 0: 6057.1. Samples: 108094790. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:38:38,119][15372] Avg episode reward: [(0, '43.629')] [2024-08-05 19:38:40,561][15444] Updated weights for policy 0, policy_version 52791 (0.0014) [2024-08-05 19:38:43,124][15372] Fps is (10 sec: 23747.0, 60 sec: 24164.1, 300 sec: 24131.2). Total num frames: 432521216. Throughput: 0: 6053.7. Samples: 108130670. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:38:43,124][15372] Avg episode reward: [(0, '44.119')] [2024-08-05 19:38:43,930][15444] Updated weights for policy 0, policy_version 52801 (0.0027) [2024-08-05 19:38:47,394][15444] Updated weights for policy 0, policy_version 52811 (0.0024) [2024-08-05 19:38:48,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 432644096. Throughput: 0: 6046.4. Samples: 108166940. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:38:48,119][15372] Avg episode reward: [(0, '43.287')] [2024-08-05 19:38:50,581][15444] Updated weights for policy 0, policy_version 52821 (0.0014) [2024-08-05 19:38:53,118][15372] Fps is (10 sec: 23770.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 432758784. Throughput: 0: 6041.8. Samples: 108184700. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 19:38:53,126][15372] Avg episode reward: [(0, '43.554')] [2024-08-05 19:38:54,327][15444] Updated weights for policy 0, policy_version 52831 (0.0013) [2024-08-05 19:38:57,874][15444] Updated weights for policy 0, policy_version 52841 (0.0024) [2024-08-05 19:38:58,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 432881664. Throughput: 0: 6028.9. Samples: 108220980. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:38:58,119][15372] Avg episode reward: [(0, '45.054')] [2024-08-05 19:38:58,121][15417] Saving new best policy, reward=45.054! [2024-08-05 19:39:00,926][15444] Updated weights for policy 0, policy_version 52851 (0.0021) [2024-08-05 19:39:03,119][15372] Fps is (10 sec: 24573.9, 60 sec: 24166.1, 300 sec: 24131.6). Total num frames: 433004544. Throughput: 0: 6015.2. Samples: 108256360. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:39:03,120][15372] Avg episode reward: [(0, '43.008')] [2024-08-05 19:39:04,639][15444] Updated weights for policy 0, policy_version 52861 (0.0011) [2024-08-05 19:39:07,903][15444] Updated weights for policy 0, policy_version 52871 (0.0024) [2024-08-05 19:39:08,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 433119232. Throughput: 0: 5998.2. Samples: 108273940. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:39:08,119][15372] Avg episode reward: [(0, '41.993')] [2024-08-05 19:39:08,892][15417] Signal inference workers to stop experience collection... (19400 times) [2024-08-05 19:39:08,893][15417] Signal inference workers to resume experience collection... (19400 times) [2024-08-05 19:39:08,972][15444] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-08-05 19:39:08,972][15444] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-08-05 19:39:11,259][15444] Updated weights for policy 0, policy_version 52881 (0.0016) [2024-08-05 19:39:13,118][15372] Fps is (10 sec: 23758.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 433242112. Throughput: 0: 6006.4. Samples: 108310630. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:39:13,119][15372] Avg episode reward: [(0, '42.413')] [2024-08-05 19:39:14,631][15444] Updated weights for policy 0, policy_version 52891 (0.0013) [2024-08-05 19:39:18,010][15444] Updated weights for policy 0, policy_version 52901 (0.0022) [2024-08-05 19:39:18,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 433364992. Throughput: 0: 5997.3. Samples: 108347000. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:39:18,119][15372] Avg episode reward: [(0, '42.792')] [2024-08-05 19:39:21,361][15444] Updated weights for policy 0, policy_version 52911 (0.0021) [2024-08-05 19:39:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 433479680. Throughput: 0: 6020.2. Samples: 108365700. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:39:23,126][15372] Avg episode reward: [(0, '42.918')] [2024-08-05 19:39:24,649][15444] Updated weights for policy 0, policy_version 52921 (0.0011) [2024-08-05 19:39:28,063][15444] Updated weights for policy 0, policy_version 52931 (0.0028) [2024-08-05 19:39:28,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 433610752. Throughput: 0: 6032.1. Samples: 108402080. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:39:28,119][15372] Avg episode reward: [(0, '43.118')] [2024-08-05 19:39:31,702][15444] Updated weights for policy 0, policy_version 52941 (0.0017) [2024-08-05 19:39:33,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24030.5, 300 sec: 24159.5). Total num frames: 433725440. Throughput: 0: 6021.3. Samples: 108437900. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:39:33,119][15372] Avg episode reward: [(0, '43.217')] [2024-08-05 19:39:34,857][15444] Updated weights for policy 0, policy_version 52951 (0.0014) [2024-08-05 19:39:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 433848320. Throughput: 0: 6036.9. Samples: 108456360. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:39:38,126][15372] Avg episode reward: [(0, '43.046')] [2024-08-05 19:39:38,435][15444] Updated weights for policy 0, policy_version 52961 (0.0020) [2024-08-05 19:39:41,755][15444] Updated weights for policy 0, policy_version 52971 (0.0026) [2024-08-05 19:39:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24168.7, 300 sec: 24131.7). 
Total num frames: 433971200. Throughput: 0: 6019.8. Samples: 108491870. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:39:43,119][15372] Avg episode reward: [(0, '43.459')] [2024-08-05 19:39:45,010][15444] Updated weights for policy 0, policy_version 52981 (0.0011) [2024-08-05 19:39:48,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 434094080. Throughput: 0: 6047.2. Samples: 108528480. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:39:48,127][15372] Avg episode reward: [(0, '43.323')] [2024-08-05 19:39:48,726][15444] Updated weights for policy 0, policy_version 52991 (0.0021) [2024-08-05 19:39:51,803][15444] Updated weights for policy 0, policy_version 53001 (0.0013) [2024-08-05 19:39:53,121][15372] Fps is (10 sec: 22932.5, 60 sec: 24029.0, 300 sec: 24103.8). Total num frames: 434200576. Throughput: 0: 6071.7. Samples: 108547180. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:39:53,121][15372] Avg episode reward: [(0, '43.222')] [2024-08-05 19:39:55,407][15444] Updated weights for policy 0, policy_version 53011 (0.0020) [2024-08-05 19:39:57,226][15417] Signal inference workers to stop experience collection... (19450 times) [2024-08-05 19:39:57,233][15417] Signal inference workers to resume experience collection... (19450 times) [2024-08-05 19:39:57,305][15444] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-08-05 19:39:57,305][15444] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-08-05 19:39:58,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 434331648. Throughput: 0: 6050.6. Samples: 108582910. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:39:58,119][15372] Avg episode reward: [(0, '43.413')] [2024-08-05 19:39:58,646][15444] Updated weights for policy 0, policy_version 53021 (0.0012) [2024-08-05 19:40:01,848][15444] Updated weights for policy 0, policy_version 53031 (0.0014) [2024-08-05 19:40:03,118][15372] Fps is (10 sec: 25400.9, 60 sec: 24166.7, 300 sec: 24131.7). Total num frames: 434454528. Throughput: 0: 6060.7. Samples: 108619730. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:40:03,119][15372] Avg episode reward: [(0, '44.325')] [2024-08-05 19:40:05,481][15444] Updated weights for policy 0, policy_version 53041 (0.0031) [2024-08-05 19:40:08,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24131.9). Total num frames: 434577408. Throughput: 0: 6055.3. Samples: 108638190. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:40:08,126][15372] Avg episode reward: [(0, '43.829')] [2024-08-05 19:40:08,718][15444] Updated weights for policy 0, policy_version 53051 (0.0012) [2024-08-05 19:40:12,205][15444] Updated weights for policy 0, policy_version 53061 (0.0014) [2024-08-05 19:40:13,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24302.8, 300 sec: 24131.7). Total num frames: 434700288. Throughput: 0: 6039.3. Samples: 108673850. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:40:13,119][15372] Avg episode reward: [(0, '43.162')] [2024-08-05 19:40:15,573][15444] Updated weights for policy 0, policy_version 53071 (0.0024) [2024-08-05 19:40:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24133.9). Total num frames: 434814976. Throughput: 0: 6067.5. Samples: 108710940. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:40:18,119][15372] Avg episode reward: [(0, '43.384')] [2024-08-05 19:40:18,151][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000053079_434823168.pth... 
[2024-08-05 19:40:18,311][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000052371_429023232.pth [2024-08-05 19:40:18,968][15444] Updated weights for policy 0, policy_version 53081 (0.0027) [2024-08-05 19:40:22,209][15444] Updated weights for policy 0, policy_version 53091 (0.0012) [2024-08-05 19:40:23,118][15372] Fps is (10 sec: 22938.2, 60 sec: 24166.4, 300 sec: 24104.1). Total num frames: 434929664. Throughput: 0: 6051.3. Samples: 108728670. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:40:23,126][15372] Avg episode reward: [(0, '42.680')] [2024-08-05 19:40:25,713][15444] Updated weights for policy 0, policy_version 53101 (0.0014) [2024-08-05 19:40:28,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 435052544. Throughput: 0: 6064.8. Samples: 108764790. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:40:28,119][15372] Avg episode reward: [(0, '42.839')] [2024-08-05 19:40:29,083][15444] Updated weights for policy 0, policy_version 53111 (0.0014) [2024-08-05 19:40:32,638][15444] Updated weights for policy 0, policy_version 53121 (0.0030) [2024-08-05 19:40:33,119][15372] Fps is (10 sec: 25392.9, 60 sec: 24302.6, 300 sec: 24159.4). Total num frames: 435183616. Throughput: 0: 6043.9. Samples: 108800460. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:40:33,120][15372] Avg episode reward: [(0, '43.891')] [2024-08-05 19:40:36,100][15444] Updated weights for policy 0, policy_version 53131 (0.0018) [2024-08-05 19:40:38,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 435298304. Throughput: 0: 6035.4. Samples: 108818760. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:40:38,126][15372] Avg episode reward: [(0, '44.032')] [2024-08-05 19:40:39,483][15444] Updated weights for policy 0, policy_version 53141 (0.0016) [2024-08-05 19:40:42,853][15444] Updated weights for policy 0, policy_version 53151 (0.0023) [2024-08-05 19:40:43,118][15372] Fps is (10 sec: 23758.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 435421184. Throughput: 0: 6045.4. Samples: 108854950. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:40:43,119][15372] Avg episode reward: [(0, '44.034')] [2024-08-05 19:40:46,066][15444] Updated weights for policy 0, policy_version 53161 (0.0016) [2024-08-05 19:40:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 435535872. Throughput: 0: 6004.4. Samples: 108889930. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:40:48,119][15372] Avg episode reward: [(0, '44.015')] [2024-08-05 19:40:49,742][15444] Updated weights for policy 0, policy_version 53171 (0.0020) [2024-08-05 19:40:52,097][15417] Signal inference workers to stop experience collection... (19500 times) [2024-08-05 19:40:52,101][15417] Signal inference workers to resume experience collection... (19500 times) [2024-08-05 19:40:52,180][15444] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-08-05 19:40:52,180][15444] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-08-05 19:40:53,118][15372] Fps is (10 sec: 22937.8, 60 sec: 24167.3, 300 sec: 24131.7). Total num frames: 435650560. Throughput: 0: 5997.1. Samples: 108908060. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:40:53,126][15372] Avg episode reward: [(0, '42.970')] [2024-08-05 19:40:53,282][15444] Updated weights for policy 0, policy_version 53181 (0.0029) [2024-08-05 19:40:56,458][15444] Updated weights for policy 0, policy_version 53191 (0.0011) [2024-08-05 19:40:58,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 435773440. Throughput: 0: 5997.6. Samples: 108943740. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:40:58,126][15372] Avg episode reward: [(0, '42.129')] [2024-08-05 19:41:00,043][15444] Updated weights for policy 0, policy_version 53201 (0.0033) [2024-08-05 19:41:03,049][15444] Updated weights for policy 0, policy_version 53211 (0.0024) [2024-08-05 19:41:03,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 435904512. Throughput: 0: 5989.3. Samples: 108980460. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:41:03,119][15372] Avg episode reward: [(0, '43.411')] [2024-08-05 19:41:06,793][15444] Updated weights for policy 0, policy_version 53221 (0.0025) [2024-08-05 19:41:08,118][15372] Fps is (10 sec: 23757.1, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 436011008. Throughput: 0: 6006.9. Samples: 108998980. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-08-05 19:41:08,119][15372] Avg episode reward: [(0, '44.483')] [2024-08-05 19:41:10,328][15444] Updated weights for policy 0, policy_version 53231 (0.0023) [2024-08-05 19:41:13,119][15372] Fps is (10 sec: 22937.8, 60 sec: 23893.4, 300 sec: 24103.9). Total num frames: 436133888. Throughput: 0: 5998.2. Samples: 109034710. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:41:13,119][15372] Avg episode reward: [(0, '42.531')] [2024-08-05 19:41:13,577][15444] Updated weights for policy 0, policy_version 53241 (0.0027) [2024-08-05 19:41:17,159][15444] Updated weights for policy 0, policy_version 53251 (0.0023) [2024-08-05 19:41:18,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 436256768. Throughput: 0: 5992.5. Samples: 109070120. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:41:18,119][15372] Avg episode reward: [(0, '43.561')] [2024-08-05 19:41:20,413][15444] Updated weights for policy 0, policy_version 53261 (0.0011) [2024-08-05 19:41:23,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 436379648. Throughput: 0: 5988.7. Samples: 109088250. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:41:23,123][15372] Avg episode reward: [(0, '43.850')] [2024-08-05 19:41:23,794][15444] Updated weights for policy 0, policy_version 53271 (0.0013) [2024-08-05 19:41:27,323][15444] Updated weights for policy 0, policy_version 53281 (0.0014) [2024-08-05 19:41:28,119][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 436494336. Throughput: 0: 5976.0. Samples: 109123870. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:41:28,119][15372] Avg episode reward: [(0, '42.820')] [2024-08-05 19:41:30,700][15444] Updated weights for policy 0, policy_version 53291 (0.0013) [2024-08-05 19:41:33,119][15372] Fps is (10 sec: 22936.7, 60 sec: 23757.0, 300 sec: 24103.9). Total num frames: 436609024. Throughput: 0: 6001.3. Samples: 109159990. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:41:33,127][15372] Avg episode reward: [(0, '42.831')] [2024-08-05 19:41:34,205][15444] Updated weights for policy 0, policy_version 53301 (0.0015) [2024-08-05 19:41:34,946][15417] Signal inference workers to stop experience collection... 
(19550 times) [2024-08-05 19:41:34,946][15417] Signal inference workers to resume experience collection... (19550 times) [2024-08-05 19:41:34,998][15444] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-08-05 19:41:34,998][15444] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-08-05 19:41:37,542][15444] Updated weights for policy 0, policy_version 53311 (0.0016) [2024-08-05 19:41:38,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 436731904. Throughput: 0: 5995.5. Samples: 109177860. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:41:38,119][15372] Avg episode reward: [(0, '42.548')] [2024-08-05 19:41:40,900][15444] Updated weights for policy 0, policy_version 53321 (0.0014) [2024-08-05 19:41:43,118][15372] Fps is (10 sec: 24577.1, 60 sec: 23893.4, 300 sec: 24103.9). Total num frames: 436854784. Throughput: 0: 6007.6. Samples: 109214080. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:41:43,126][15372] Avg episode reward: [(0, '42.897')] [2024-08-05 19:41:44,468][15444] Updated weights for policy 0, policy_version 53331 (0.0017) [2024-08-05 19:41:47,670][15444] Updated weights for policy 0, policy_version 53341 (0.0012) [2024-08-05 19:41:48,118][15372] Fps is (10 sec: 23757.1, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 436969472. Throughput: 0: 5974.7. Samples: 109249320. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:41:48,119][15372] Avg episode reward: [(0, '43.372')] [2024-08-05 19:41:51,298][15444] Updated weights for policy 0, policy_version 53351 (0.0015) [2024-08-05 19:41:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 437092352. Throughput: 0: 5977.1. Samples: 109267950. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:41:53,119][15372] Avg episode reward: [(0, '44.530')] [2024-08-05 19:41:54,763][15444] Updated weights for policy 0, policy_version 53361 (0.0015) [2024-08-05 19:41:57,966][15444] Updated weights for policy 0, policy_version 53371 (0.0017) [2024-08-05 19:41:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 437215232. Throughput: 0: 5995.8. Samples: 109304520. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:41:58,119][15372] Avg episode reward: [(0, '45.036')] [2024-08-05 19:42:01,476][15444] Updated weights for policy 0, policy_version 53381 (0.0030) [2024-08-05 19:42:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23756.9, 300 sec: 24076.2). Total num frames: 437329920. Throughput: 0: 5982.0. Samples: 109339310. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:42:03,126][15372] Avg episode reward: [(0, '44.346')] [2024-08-05 19:42:04,621][15444] Updated weights for policy 0, policy_version 53391 (0.0020) [2024-08-05 19:42:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 437452800. Throughput: 0: 5997.1. Samples: 109358120. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:42:08,126][15372] Avg episode reward: [(0, '43.916')] [2024-08-05 19:42:08,252][15444] Updated weights for policy 0, policy_version 53401 (0.0013) [2024-08-05 19:42:11,575][15444] Updated weights for policy 0, policy_version 53411 (0.0031) [2024-08-05 19:42:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 437575680. Throughput: 0: 6002.5. Samples: 109393980. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:42:13,126][15372] Avg episode reward: [(0, '43.149')] [2024-08-05 19:42:14,934][15444] Updated weights for policy 0, policy_version 53421 (0.0012) [2024-08-05 19:42:17,360][15417] Signal inference workers to stop experience collection... 
(19600 times) [2024-08-05 19:42:17,360][15417] Signal inference workers to resume experience collection... (19600 times) [2024-08-05 19:42:17,395][15444] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-08-05 19:42:17,395][15444] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-08-05 19:42:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 23893.4, 300 sec: 24076.1). Total num frames: 437690368. Throughput: 0: 6002.5. Samples: 109430100. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:42:18,126][15372] Avg episode reward: [(0, '43.485')] [2024-08-05 19:42:18,187][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000053430_437698560.pth... [2024-08-05 19:42:18,312][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000052725_431923200.pth [2024-08-05 19:42:18,600][15444] Updated weights for policy 0, policy_version 53431 (0.0019) [2024-08-05 19:42:21,633][15444] Updated weights for policy 0, policy_version 53441 (0.0011) [2024-08-05 19:42:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.4, 300 sec: 24103.9). Total num frames: 437813248. Throughput: 0: 6014.9. Samples: 109448530. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:42:23,126][15372] Avg episode reward: [(0, '43.167')] [2024-08-05 19:42:25,110][15444] Updated weights for policy 0, policy_version 53451 (0.0011) [2024-08-05 19:42:28,119][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 437944320. Throughput: 0: 6025.3. Samples: 109485220. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:42:28,126][15372] Avg episode reward: [(0, '43.586')] [2024-08-05 19:42:28,621][15444] Updated weights for policy 0, policy_version 53461 (0.0018) [2024-08-05 19:42:31,915][15444] Updated weights for policy 0, policy_version 53471 (0.0015) [2024-08-05 19:42:33,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 438059008. Throughput: 0: 6024.2. Samples: 109520410. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:42:33,119][15372] Avg episode reward: [(0, '43.226')] [2024-08-05 19:42:35,417][15444] Updated weights for policy 0, policy_version 53481 (0.0018) [2024-08-05 19:42:38,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 438181888. Throughput: 0: 6032.7. Samples: 109539420. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:42:38,119][15372] Avg episode reward: [(0, '44.092')] [2024-08-05 19:42:38,527][15444] Updated weights for policy 0, policy_version 53491 (0.0018) [2024-08-05 19:42:42,301][15444] Updated weights for policy 0, policy_version 53501 (0.0026) [2024-08-05 19:42:43,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 438304768. Throughput: 0: 6019.6. Samples: 109575400. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:42:43,119][15372] Avg episode reward: [(0, '44.995')] [2024-08-05 19:42:45,745][15444] Updated weights for policy 0, policy_version 53511 (0.0019) [2024-08-05 19:42:48,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 438419456. Throughput: 0: 6043.1. Samples: 109611250. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:42:48,126][15372] Avg episode reward: [(0, '44.284')] [2024-08-05 19:42:49,057][15444] Updated weights for policy 0, policy_version 53521 (0.0020) [2024-08-05 19:42:52,522][15444] Updated weights for policy 0, policy_version 53531 (0.0017) [2024-08-05 19:42:53,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 438542336. Throughput: 0: 6027.8. Samples: 109629370. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:42:53,119][15372] Avg episode reward: [(0, '44.582')] [2024-08-05 19:42:55,738][15444] Updated weights for policy 0, policy_version 53541 (0.0028) [2024-08-05 19:42:58,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 438657024. Throughput: 0: 6031.1. Samples: 109665380. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:42:58,126][15372] Avg episode reward: [(0, '42.917')] [2024-08-05 19:42:59,231][15444] Updated weights for policy 0, policy_version 53551 (0.0012) [2024-08-05 19:43:02,662][15444] Updated weights for policy 0, policy_version 53561 (0.0011) [2024-08-05 19:43:03,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24076.1). Total num frames: 438779904. Throughput: 0: 6021.1. Samples: 109701050. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:43:03,119][15372] Avg episode reward: [(0, '42.886')] [2024-08-05 19:43:05,776][15417] Signal inference workers to stop experience collection... (19650 times) [2024-08-05 19:43:05,781][15417] Signal inference workers to resume experience collection... 
(19650 times) [2024-08-05 19:43:05,828][15444] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-08-05 19:43:05,836][15444] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-08-05 19:43:05,891][15444] Updated weights for policy 0, policy_version 53571 (0.0017) [2024-08-05 19:43:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24076.1). Total num frames: 438902784. Throughput: 0: 6021.1. Samples: 109719480. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:43:08,119][15372] Avg episode reward: [(0, '44.369')] [2024-08-05 19:43:09,500][15444] Updated weights for policy 0, policy_version 53581 (0.0012) [2024-08-05 19:43:12,841][15444] Updated weights for policy 0, policy_version 53591 (0.0012) [2024-08-05 19:43:13,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.7, 300 sec: 24076.1). Total num frames: 439017472. Throughput: 0: 6016.4. Samples: 109755960. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:43:13,119][15372] Avg episode reward: [(0, '43.944')] [2024-08-05 19:43:16,253][15444] Updated weights for policy 0, policy_version 53601 (0.0025) [2024-08-05 19:43:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24076.1). Total num frames: 439140352. Throughput: 0: 6028.0. Samples: 109791670. Policy #0 lag: (min: 0.0, avg: 3.9, max: 7.0) [2024-08-05 19:43:18,119][15372] Avg episode reward: [(0, '43.470')] [2024-08-05 19:43:19,493][15444] Updated weights for policy 0, policy_version 53611 (0.0022) [2024-08-05 19:43:23,007][15444] Updated weights for policy 0, policy_version 53621 (0.0020) [2024-08-05 19:43:23,119][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.3, 300 sec: 24076.1). Total num frames: 439263232. Throughput: 0: 6025.5. Samples: 109810570. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:43:23,119][15372] Avg episode reward: [(0, '43.674')] [2024-08-05 19:43:26,189][15444] Updated weights for policy 0, policy_version 53631 (0.0015) [2024-08-05 19:43:28,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24029.9, 300 sec: 24076.3). Total num frames: 439386112. Throughput: 0: 6023.8. Samples: 109846470. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:43:28,126][15372] Avg episode reward: [(0, '43.457')] [2024-08-05 19:43:29,668][15444] Updated weights for policy 0, policy_version 53641 (0.0026) [2024-08-05 19:43:33,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 439500800. Throughput: 0: 6031.8. Samples: 109882680. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:43:33,126][15372] Avg episode reward: [(0, '43.765')] [2024-08-05 19:43:33,198][15444] Updated weights for policy 0, policy_version 53651 (0.0024) [2024-08-05 19:43:36,254][15444] Updated weights for policy 0, policy_version 53661 (0.0013) [2024-08-05 19:43:38,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.9, 300 sec: 24076.6). Total num frames: 439623680. Throughput: 0: 6045.3. Samples: 109901410. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:43:38,119][15372] Avg episode reward: [(0, '43.870')] [2024-08-05 19:43:39,987][15444] Updated weights for policy 0, policy_version 53671 (0.0012) [2024-08-05 19:43:43,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 439754752. Throughput: 0: 6056.9. Samples: 109937940. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:43:43,126][15372] Avg episode reward: [(0, '43.669')] [2024-08-05 19:43:43,135][15444] Updated weights for policy 0, policy_version 53681 (0.0016) [2024-08-05 19:43:46,549][15444] Updated weights for policy 0, policy_version 53691 (0.0011) [2024-08-05 19:43:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.5, 300 sec: 24103.9). 
Total num frames: 439869440. Throughput: 0: 6060.9. Samples: 109973790. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:43:48,126][15372] Avg episode reward: [(0, '43.977')] [2024-08-05 19:43:49,940][15444] Updated weights for policy 0, policy_version 53701 (0.0024) [2024-08-05 19:43:53,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 439992320. Throughput: 0: 6069.3. Samples: 109992600. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 19:43:53,127][15372] Avg episode reward: [(0, '43.838')] [2024-08-05 19:43:53,145][15444] Updated weights for policy 0, policy_version 53711 (0.0028) [2024-08-05 19:43:56,887][15444] Updated weights for policy 0, policy_version 53721 (0.0022) [2024-08-05 19:43:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24076.2). Total num frames: 440107008. Throughput: 0: 6054.0. Samples: 110028390. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:43:58,119][15372] Avg episode reward: [(0, '44.263')] [2024-08-05 19:44:00,258][15444] Updated weights for policy 0, policy_version 53731 (0.0011) [2024-08-05 19:44:03,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.8, 300 sec: 24131.7). Total num frames: 440238080. Throughput: 0: 6073.3. Samples: 110064970. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:44:03,140][15372] Avg episode reward: [(0, '44.185')] [2024-08-05 19:44:03,493][15444] Updated weights for policy 0, policy_version 53741 (0.0013) [2024-08-05 19:44:06,325][15417] Signal inference workers to stop experience collection... (19700 times) [2024-08-05 19:44:06,325][15417] Signal inference workers to resume experience collection... 
(19700 times) [2024-08-05 19:44:06,371][15444] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-08-05 19:44:06,371][15444] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-08-05 19:44:07,055][15444] Updated weights for policy 0, policy_version 53751 (0.0029) [2024-08-05 19:44:08,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 440352768. Throughput: 0: 6049.3. Samples: 110082790. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:44:08,119][15372] Avg episode reward: [(0, '44.165')] [2024-08-05 19:44:10,159][15444] Updated weights for policy 0, policy_version 53761 (0.0012) [2024-08-05 19:44:13,118][15372] Fps is (10 sec: 23757.5, 60 sec: 24303.1, 300 sec: 24103.9). Total num frames: 440475648. Throughput: 0: 6045.3. Samples: 110118510. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:44:13,126][15372] Avg episode reward: [(0, '44.742')] [2024-08-05 19:44:13,887][15444] Updated weights for policy 0, policy_version 53771 (0.0030) [2024-08-05 19:44:16,993][15444] Updated weights for policy 0, policy_version 53781 (0.0015) [2024-08-05 19:44:18,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 440590336. Throughput: 0: 6033.3. Samples: 110154180. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:44:18,120][15372] Avg episode reward: [(0, '44.265')] [2024-08-05 19:44:18,125][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000053783_440590336.pth... [2024-08-05 19:44:18,265][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000053079_434823168.pth [2024-08-05 19:44:20,723][15444] Updated weights for policy 0, policy_version 53791 (0.0023) [2024-08-05 19:44:23,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24166.3, 300 sec: 24076.1). 
Total num frames: 440713216. Throughput: 0: 6023.7. Samples: 110172480. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:44:23,119][15372] Avg episode reward: [(0, '43.820')] [2024-08-05 19:44:24,242][15444] Updated weights for policy 0, policy_version 53801 (0.0022) [2024-08-05 19:44:27,477][15444] Updated weights for policy 0, policy_version 53811 (0.0022) [2024-08-05 19:44:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.8, 300 sec: 24076.1). Total num frames: 440827904. Throughput: 0: 6005.1. Samples: 110208170. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:44:28,119][15372] Avg episode reward: [(0, '43.330')] [2024-08-05 19:44:30,981][15444] Updated weights for policy 0, policy_version 53821 (0.0024) [2024-08-05 19:44:33,119][15372] Fps is (10 sec: 22937.8, 60 sec: 24029.8, 300 sec: 24048.4). Total num frames: 440942592. Throughput: 0: 6001.1. Samples: 110243840. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:44:33,119][15372] Avg episode reward: [(0, '43.201')] [2024-08-05 19:44:34,331][15444] Updated weights for policy 0, policy_version 53831 (0.0025) [2024-08-05 19:44:37,936][15444] Updated weights for policy 0, policy_version 53841 (0.0022) [2024-08-05 19:44:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24048.4). Total num frames: 441065472. Throughput: 0: 5972.5. Samples: 110261360. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:44:38,119][15372] Avg episode reward: [(0, '43.119')] [2024-08-05 19:44:41,334][15444] Updated weights for policy 0, policy_version 53851 (0.0018) [2024-08-05 19:44:42,465][15417] Signal inference workers to stop experience collection... (19750 times) [2024-08-05 19:44:42,473][15417] Signal inference workers to resume experience collection... 
(19750 times) [2024-08-05 19:44:42,509][15444] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-08-05 19:44:42,516][15444] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-08-05 19:44:43,118][15372] Fps is (10 sec: 24576.7, 60 sec: 23893.4, 300 sec: 24048.4). Total num frames: 441188352. Throughput: 0: 5982.7. Samples: 110297610. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:44:43,119][15372] Avg episode reward: [(0, '43.847')] [2024-08-05 19:44:44,484][15444] Updated weights for policy 0, policy_version 53861 (0.0020) [2024-08-05 19:44:48,101][15444] Updated weights for policy 0, policy_version 53871 (0.0021) [2024-08-05 19:44:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24104.1). Total num frames: 441311232. Throughput: 0: 5977.1. Samples: 110333940. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:44:48,119][15372] Avg episode reward: [(0, '44.248')] [2024-08-05 19:44:51,138][15444] Updated weights for policy 0, policy_version 53881 (0.0029) [2024-08-05 19:44:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24030.0, 300 sec: 24076.2). Total num frames: 441434112. Throughput: 0: 5987.2. Samples: 110352210. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:44:53,126][15372] Avg episode reward: [(0, '45.039')] [2024-08-05 19:44:54,710][15444] Updated weights for policy 0, policy_version 53891 (0.0025) [2024-08-05 19:44:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24048.4). Total num frames: 441548800. Throughput: 0: 6004.7. Samples: 110388720. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 19:44:58,126][15372] Avg episode reward: [(0, '44.782')] [2024-08-05 19:44:58,254][15444] Updated weights for policy 0, policy_version 53901 (0.0013) [2024-08-05 19:45:01,317][15444] Updated weights for policy 0, policy_version 53911 (0.0010) [2024-08-05 19:45:03,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24029.9, 300 sec: 24076.1). 
Total num frames: 441679872. Throughput: 0: 6013.5. Samples: 110424790. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 19:45:03,126][15372] Avg episode reward: [(0, '43.372')] [2024-08-05 19:45:04,923][15444] Updated weights for policy 0, policy_version 53921 (0.0015) [2024-08-05 19:45:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24030.0, 300 sec: 24048.4). Total num frames: 441794560. Throughput: 0: 6019.2. Samples: 110443340. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 19:45:08,126][15372] Avg episode reward: [(0, '43.458')] [2024-08-05 19:45:08,376][15444] Updated weights for policy 0, policy_version 53931 (0.0017) [2024-08-05 19:45:11,488][15444] Updated weights for policy 0, policy_version 53941 (0.0016) [2024-08-05 19:45:13,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.8, 300 sec: 24076.1). Total num frames: 441917440. Throughput: 0: 6030.4. Samples: 110479540. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 19:45:13,126][15372] Avg episode reward: [(0, '43.324')] [2024-08-05 19:45:15,145][15444] Updated weights for policy 0, policy_version 53951 (0.0017) [2024-08-05 19:45:18,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 442040320. Throughput: 0: 6057.6. Samples: 110516430. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 19:45:18,126][15372] Avg episode reward: [(0, '44.015')] [2024-08-05 19:45:18,293][15444] Updated weights for policy 0, policy_version 53961 (0.0019) [2024-08-05 19:45:21,830][15444] Updated weights for policy 0, policy_version 53971 (0.0010) [2024-08-05 19:45:23,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 442163200. Throughput: 0: 6071.1. Samples: 110534560. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 19:45:23,119][15372] Avg episode reward: [(0, '44.312')] [2024-08-05 19:45:25,050][15444] Updated weights for policy 0, policy_version 53981 (0.0022) [2024-08-05 19:45:28,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24048.5). Total num frames: 442277888. Throughput: 0: 6077.6. Samples: 110571100. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 19:45:28,126][15372] Avg episode reward: [(0, '44.839')] [2024-08-05 19:45:28,456][15444] Updated weights for policy 0, policy_version 53991 (0.0025) [2024-08-05 19:45:30,353][15417] Signal inference workers to stop experience collection... (19800 times) [2024-08-05 19:45:30,357][15417] Signal inference workers to resume experience collection... (19800 times) [2024-08-05 19:45:30,402][15444] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-08-05 19:45:30,407][15444] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-08-05 19:45:32,002][15444] Updated weights for policy 0, policy_version 54001 (0.0026) [2024-08-05 19:45:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.6, 300 sec: 24103.9). Total num frames: 442408960. Throughput: 0: 6069.8. Samples: 110607080. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 19:45:33,119][15372] Avg episode reward: [(0, '44.224')] [2024-08-05 19:45:34,963][15444] Updated weights for policy 0, policy_version 54011 (0.0014) [2024-08-05 19:45:38,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24048.4). Total num frames: 442515456. Throughput: 0: 6079.3. Samples: 110625780. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:45:38,126][15372] Avg episode reward: [(0, '43.556')] [2024-08-05 19:45:38,721][15444] Updated weights for policy 0, policy_version 54021 (0.0017) [2024-08-05 19:45:42,151][15444] Updated weights for policy 0, policy_version 54031 (0.0021) [2024-08-05 19:45:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 442646528. Throughput: 0: 6061.6. Samples: 110661490. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:45:43,119][15372] Avg episode reward: [(0, '43.227')] [2024-08-05 19:45:45,353][15444] Updated weights for policy 0, policy_version 54041 (0.0012) [2024-08-05 19:45:48,119][15372] Fps is (10 sec: 25395.0, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 442769408. Throughput: 0: 6066.9. Samples: 110697800. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:45:48,119][15372] Avg episode reward: [(0, '43.440')] [2024-08-05 19:45:48,885][15444] Updated weights for policy 0, policy_version 54051 (0.0016) [2024-08-05 19:45:52,037][15444] Updated weights for policy 0, policy_version 54061 (0.0012) [2024-08-05 19:45:53,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 442884096. Throughput: 0: 6063.1. Samples: 110716180. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:45:53,119][15372] Avg episode reward: [(0, '43.708')] [2024-08-05 19:45:55,631][15444] Updated weights for policy 0, policy_version 54071 (0.0029) [2024-08-05 19:45:58,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24302.9, 300 sec: 24076.2). Total num frames: 443006976. Throughput: 0: 6054.9. Samples: 110752010. 
Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:45:58,126][15372] Avg episode reward: [(0, '44.194')] [2024-08-05 19:45:58,848][15444] Updated weights for policy 0, policy_version 54081 (0.0034) [2024-08-05 19:46:02,504][15444] Updated weights for policy 0, policy_version 54091 (0.0012) [2024-08-05 19:46:02,607][15417] Signal inference workers to stop experience collection... (19850 times) [2024-08-05 19:46:02,608][15417] Signal inference workers to resume experience collection... (19850 times) [2024-08-05 19:46:02,654][15444] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-08-05 19:46:02,661][15444] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-08-05 19:46:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 443129856. Throughput: 0: 6028.9. Samples: 110787730. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:46:03,119][15372] Avg episode reward: [(0, '44.082')] [2024-08-05 19:46:06,020][15444] Updated weights for policy 0, policy_version 54101 (0.0018) [2024-08-05 19:46:08,122][15372] Fps is (10 sec: 23748.1, 60 sec: 24164.9, 300 sec: 24103.6). Total num frames: 443244544. Throughput: 0: 6034.0. Samples: 110806110. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0) [2024-08-05 19:46:08,123][15372] Avg episode reward: [(0, '42.645')] [2024-08-05 19:46:09,339][15444] Updated weights for policy 0, policy_version 54111 (0.0014) [2024-08-05 19:46:13,118][15372] Fps is (10 sec: 22118.6, 60 sec: 23893.4, 300 sec: 24048.4). Total num frames: 443351040. Throughput: 0: 6004.4. Samples: 110841300. 
Policy #0 lag: (min: 1.0, avg: 5.3, max: 10.0) [2024-08-05 19:46:13,126][15372] Avg episode reward: [(0, '42.482')] [2024-08-05 19:46:13,125][15444] Updated weights for policy 0, policy_version 54121 (0.0012) [2024-08-05 19:46:16,007][15444] Updated weights for policy 0, policy_version 54131 (0.0012) [2024-08-05 19:46:18,119][15372] Fps is (10 sec: 23765.4, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 443482112. Throughput: 0: 6006.9. Samples: 110877390. Policy #0 lag: (min: 1.0, avg: 5.3, max: 10.0) [2024-08-05 19:46:18,119][15372] Avg episode reward: [(0, '43.949')] [2024-08-05 19:46:18,132][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000054137_443490304.pth... [2024-08-05 19:46:18,272][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000053430_437698560.pth [2024-08-05 19:46:19,843][15444] Updated weights for policy 0, policy_version 54141 (0.0011) [2024-08-05 19:46:23,025][15444] Updated weights for policy 0, policy_version 54151 (0.0021) [2024-08-05 19:46:23,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 443604992. Throughput: 0: 5969.1. Samples: 110894390. Policy #0 lag: (min: 1.0, avg: 5.3, max: 10.0) [2024-08-05 19:46:23,119][15372] Avg episode reward: [(0, '44.052')] [2024-08-05 19:46:26,555][15444] Updated weights for policy 0, policy_version 54161 (0.0014) [2024-08-05 19:46:28,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 443719680. Throughput: 0: 5972.0. Samples: 110930230. Policy #0 lag: (min: 1.0, avg: 5.3, max: 10.0) [2024-08-05 19:46:28,126][15372] Avg episode reward: [(0, '43.412')] [2024-08-05 19:46:30,030][15444] Updated weights for policy 0, policy_version 54171 (0.0013) [2024-08-05 19:46:30,951][15417] Signal inference workers to stop experience collection... 
(19900 times) [2024-08-05 19:46:30,952][15417] Signal inference workers to resume experience collection... (19900 times) [2024-08-05 19:46:30,994][15444] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-08-05 19:46:30,995][15444] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-08-05 19:46:33,118][15372] Fps is (10 sec: 23757.3, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 443842560. Throughput: 0: 5984.0. Samples: 110967080. Policy #0 lag: (min: 1.0, avg: 5.3, max: 10.0) [2024-08-05 19:46:33,119][15372] Avg episode reward: [(0, '43.505')] [2024-08-05 19:46:33,139][15444] Updated weights for policy 0, policy_version 54181 (0.0022) [2024-08-05 19:46:36,682][15444] Updated weights for policy 0, policy_version 54191 (0.0013) [2024-08-05 19:46:38,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.8, 300 sec: 24076.1). Total num frames: 443957248. Throughput: 0: 5988.4. Samples: 110985660. Policy #0 lag: (min: 1.0, avg: 5.3, max: 10.0) [2024-08-05 19:46:38,119][15372] Avg episode reward: [(0, '43.992')] [2024-08-05 19:46:40,110][15444] Updated weights for policy 0, policy_version 54201 (0.0014) [2024-08-05 19:46:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 444088320. Throughput: 0: 6013.1. Samples: 111022600. Policy #0 lag: (min: 1.0, avg: 5.3, max: 10.0) [2024-08-05 19:46:43,126][15372] Avg episode reward: [(0, '43.261')] [2024-08-05 19:46:43,360][15444] Updated weights for policy 0, policy_version 54211 (0.0014) [2024-08-05 19:46:46,952][15444] Updated weights for policy 0, policy_version 54221 (0.0033) [2024-08-05 19:46:48,120][15372] Fps is (10 sec: 25390.7, 60 sec: 24029.2, 300 sec: 24131.5). Total num frames: 444211200. Throughput: 0: 6010.6. Samples: 111058220. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:46:48,121][15372] Avg episode reward: [(0, '42.499')] [2024-08-05 19:46:49,986][15444] Updated weights for policy 0, policy_version 54231 (0.0022) [2024-08-05 19:46:53,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 444325888. Throughput: 0: 6019.6. Samples: 111076970. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:46:53,127][15372] Avg episode reward: [(0, '43.035')] [2024-08-05 19:46:53,749][15444] Updated weights for policy 0, policy_version 54241 (0.0019) [2024-08-05 19:46:56,673][15444] Updated weights for policy 0, policy_version 54251 (0.0023) [2024-08-05 19:46:58,119][15372] Fps is (10 sec: 22940.8, 60 sec: 23893.2, 300 sec: 24103.9). Total num frames: 444440576. Throughput: 0: 6016.4. Samples: 111112040. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:46:58,127][15372] Avg episode reward: [(0, '42.561')] [2024-08-05 19:47:00,430][15444] Updated weights for policy 0, policy_version 54261 (0.0013) [2024-08-05 19:47:03,120][15372] Fps is (10 sec: 25391.4, 60 sec: 24165.7, 300 sec: 24159.3). Total num frames: 444579840. Throughput: 0: 6031.6. Samples: 111148820. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:47:03,120][15372] Avg episode reward: [(0, '42.732')] [2024-08-05 19:47:03,913][15444] Updated weights for policy 0, policy_version 54271 (0.0039) [2024-08-05 19:47:07,100][15444] Updated weights for policy 0, policy_version 54281 (0.0025) [2024-08-05 19:47:08,119][15372] Fps is (10 sec: 24576.9, 60 sec: 24031.3, 300 sec: 24103.9). Total num frames: 444686336. Throughput: 0: 6049.8. Samples: 111166630. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:47:08,119][15372] Avg episode reward: [(0, '43.959')] [2024-08-05 19:47:10,500][15444] Updated weights for policy 0, policy_version 54291 (0.0010) [2024-08-05 19:47:11,414][15417] Signal inference workers to stop experience collection... 
(19950 times) [2024-08-05 19:47:11,422][15417] Signal inference workers to resume experience collection... (19950 times) [2024-08-05 19:47:11,475][15444] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-08-05 19:47:11,485][15444] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-08-05 19:47:13,118][15372] Fps is (10 sec: 23761.0, 60 sec: 24439.5, 300 sec: 24159.5). Total num frames: 444817408. Throughput: 0: 6058.7. Samples: 111202870. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:47:13,119][15372] Avg episode reward: [(0, '44.183')] [2024-08-05 19:47:13,876][15444] Updated weights for policy 0, policy_version 54301 (0.0011) [2024-08-05 19:47:17,114][15444] Updated weights for policy 0, policy_version 54311 (0.0011) [2024-08-05 19:47:18,120][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 444932096. Throughput: 0: 6060.7. Samples: 111239810. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 19:47:18,120][15372] Avg episode reward: [(0, '43.820')] [2024-08-05 19:47:20,412][15444] Updated weights for policy 0, policy_version 54321 (0.0019) [2024-08-05 19:47:23,122][15372] Fps is (10 sec: 23749.1, 60 sec: 24165.2, 300 sec: 24103.7). Total num frames: 445054976. Throughput: 0: 6060.2. Samples: 111258390. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 19:47:23,122][15372] Avg episode reward: [(0, '44.378')] [2024-08-05 19:47:24,019][15444] Updated weights for policy 0, policy_version 54331 (0.0027) [2024-08-05 19:47:27,386][15444] Updated weights for policy 0, policy_version 54341 (0.0019) [2024-08-05 19:47:28,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 445177856. Throughput: 0: 6043.5. Samples: 111294560. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 19:47:28,119][15372] Avg episode reward: [(0, '44.532')] [2024-08-05 19:47:30,632][15444] Updated weights for policy 0, policy_version 54351 (0.0012) [2024-08-05 19:47:33,119][15372] Fps is (10 sec: 24583.7, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 445300736. Throughput: 0: 6060.5. Samples: 111330930. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 19:47:33,126][15372] Avg episode reward: [(0, '44.070')] [2024-08-05 19:47:34,388][15444] Updated weights for policy 0, policy_version 54361 (0.0019) [2024-08-05 19:47:37,392][15444] Updated weights for policy 0, policy_version 54371 (0.0015) [2024-08-05 19:47:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24303.0, 300 sec: 24103.9). Total num frames: 445415424. Throughput: 0: 6033.1. Samples: 111348460. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 19:47:38,126][15372] Avg episode reward: [(0, '43.863')] [2024-08-05 19:47:41,065][15444] Updated weights for policy 0, policy_version 54381 (0.0014) [2024-08-05 19:47:43,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 445538304. Throughput: 0: 6041.2. Samples: 111383890. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 19:47:43,119][15372] Avg episode reward: [(0, '43.487')] [2024-08-05 19:47:44,557][15444] Updated weights for policy 0, policy_version 54391 (0.0035) [2024-08-05 19:47:47,791][15444] Updated weights for policy 0, policy_version 54401 (0.0024) [2024-08-05 19:47:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24167.2, 300 sec: 24131.7). Total num frames: 445661184. Throughput: 0: 6034.9. Samples: 111420380. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 19:47:48,119][15372] Avg episode reward: [(0, '43.457')] [2024-08-05 19:47:51,296][15444] Updated weights for policy 0, policy_version 54411 (0.0011) [2024-08-05 19:47:53,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.5, 300 sec: 24131.7). 
Total num frames: 445775872. Throughput: 0: 6047.6. Samples: 111438770. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 19:47:53,126][15372] Avg episode reward: [(0, '43.214')] [2024-08-05 19:47:54,487][15444] Updated weights for policy 0, policy_version 54421 (0.0021) [2024-08-05 19:47:57,963][15444] Updated weights for policy 0, policy_version 54431 (0.0012) [2024-08-05 19:47:58,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24303.1, 300 sec: 24131.7). Total num frames: 445898752. Throughput: 0: 6049.6. Samples: 111475100. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 19:47:58,119][15372] Avg episode reward: [(0, '43.385')] [2024-08-05 19:47:59,898][15417] Signal inference workers to stop experience collection... (20000 times) [2024-08-05 19:47:59,900][15417] Signal inference workers to resume experience collection... (20000 times) [2024-08-05 19:47:59,970][15444] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-08-05 19:47:59,971][15444] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-08-05 19:48:01,365][15444] Updated weights for policy 0, policy_version 54441 (0.0014) [2024-08-05 19:48:03,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24030.5, 300 sec: 24131.7). Total num frames: 446021632. Throughput: 0: 6024.6. Samples: 111510920. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 19:48:03,119][15372] Avg episode reward: [(0, '44.502')] [2024-08-05 19:48:04,625][15444] Updated weights for policy 0, policy_version 54451 (0.0019) [2024-08-05 19:48:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 446136320. Throughput: 0: 6040.0. Samples: 111530170. 
Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 19:48:08,126][15372] Avg episode reward: [(0, '43.761')] [2024-08-05 19:48:08,237][15444] Updated weights for policy 0, policy_version 54461 (0.0024) [2024-08-05 19:48:11,242][15444] Updated weights for policy 0, policy_version 54471 (0.0011) [2024-08-05 19:48:13,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 446267392. Throughput: 0: 6023.8. Samples: 111565630. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 19:48:13,126][15372] Avg episode reward: [(0, '42.553')] [2024-08-05 19:48:14,970][15444] Updated weights for policy 0, policy_version 54481 (0.0020) [2024-08-05 19:48:18,119][15372] Fps is (10 sec: 24574.2, 60 sec: 24166.1, 300 sec: 24131.6). Total num frames: 446382080. Throughput: 0: 6025.5. Samples: 111602080. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 19:48:18,127][15372] Avg episode reward: [(0, '43.807')] [2024-08-05 19:48:18,132][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000054490_446382080.pth... [2024-08-05 19:48:18,273][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000053783_440590336.pth [2024-08-05 19:48:18,381][15444] Updated weights for policy 0, policy_version 54491 (0.0028) [2024-08-05 19:48:21,660][15444] Updated weights for policy 0, policy_version 54501 (0.0014) [2024-08-05 19:48:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24167.7, 300 sec: 24131.7). Total num frames: 446504960. Throughput: 0: 6042.7. Samples: 111620380. Policy #0 lag: (min: 0.0, avg: 4.6, max: 8.0) [2024-08-05 19:48:23,126][15372] Avg episode reward: [(0, '44.201')] [2024-08-05 19:48:24,911][15444] Updated weights for policy 0, policy_version 54511 (0.0012) [2024-08-05 19:48:28,118][15372] Fps is (10 sec: 24577.8, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 446627840. 
Throughput: 0: 6072.0. Samples: 111657130. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:48:28,126][15372] Avg episode reward: [(0, '44.106')] [2024-08-05 19:48:28,366][15444] Updated weights for policy 0, policy_version 54521 (0.0013) [2024-08-05 19:48:31,690][15444] Updated weights for policy 0, policy_version 54531 (0.0013) [2024-08-05 19:48:33,120][15372] Fps is (10 sec: 23754.2, 60 sec: 24029.5, 300 sec: 24131.6). Total num frames: 446742528. Throughput: 0: 6047.4. Samples: 111692520. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:48:33,120][15372] Avg episode reward: [(0, '43.988')] [2024-08-05 19:48:35,167][15444] Updated weights for policy 0, policy_version 54541 (0.0026) [2024-08-05 19:48:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 446865408. Throughput: 0: 6042.9. Samples: 111710700. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:48:38,126][15372] Avg episode reward: [(0, '43.689')] [2024-08-05 19:48:38,859][15444] Updated weights for policy 0, policy_version 54551 (0.0013) [2024-08-05 19:48:42,150][15444] Updated weights for policy 0, policy_version 54561 (0.0014) [2024-08-05 19:48:43,118][15372] Fps is (10 sec: 24578.6, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 446988288. Throughput: 0: 6019.3. Samples: 111745970. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:48:43,119][15372] Avg episode reward: [(0, '42.715')] [2024-08-05 19:48:45,497][15444] Updated weights for policy 0, policy_version 54571 (0.0032) [2024-08-05 19:48:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 447102976. Throughput: 0: 6045.6. Samples: 111782970. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:48:48,119][15372] Avg episode reward: [(0, '42.954')] [2024-08-05 19:48:48,854][15444] Updated weights for policy 0, policy_version 54581 (0.0014) [2024-08-05 19:48:50,504][15417] Signal inference workers to stop experience collection... (20050 times) [2024-08-05 19:48:50,504][15417] Signal inference workers to resume experience collection... (20050 times) [2024-08-05 19:48:50,551][15444] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-08-05 19:48:50,561][15444] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-08-05 19:48:52,416][15444] Updated weights for policy 0, policy_version 54591 (0.0034) [2024-08-05 19:48:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 447234048. Throughput: 0: 6017.3. Samples: 111800950. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:48:53,126][15372] Avg episode reward: [(0, '43.374')] [2024-08-05 19:48:55,537][15444] Updated weights for policy 0, policy_version 54601 (0.0012) [2024-08-05 19:48:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 447348736. Throughput: 0: 6044.0. Samples: 111837610. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 19:48:58,126][15372] Avg episode reward: [(0, '43.700')] [2024-08-05 19:48:58,968][15444] Updated weights for policy 0, policy_version 54611 (0.0010) [2024-08-05 19:49:02,355][15444] Updated weights for policy 0, policy_version 54621 (0.0017) [2024-08-05 19:49:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 447471616. Throughput: 0: 6033.9. Samples: 111873600. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 19:49:03,126][15372] Avg episode reward: [(0, '43.811')] [2024-08-05 19:49:05,596][15444] Updated weights for policy 0, policy_version 54631 (0.0019) [2024-08-05 19:49:08,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 447586304. Throughput: 0: 6028.7. Samples: 111891670. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 19:49:08,126][15372] Avg episode reward: [(0, '43.978')] [2024-08-05 19:49:09,291][15444] Updated weights for policy 0, policy_version 54641 (0.0019) [2024-08-05 19:49:12,524][15444] Updated weights for policy 0, policy_version 54651 (0.0022) [2024-08-05 19:49:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 447709184. Throughput: 0: 6024.2. Samples: 111928220. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 19:49:13,119][15372] Avg episode reward: [(0, '43.092')] [2024-08-05 19:49:16,029][15444] Updated weights for policy 0, policy_version 54661 (0.0018) [2024-08-05 19:49:18,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.7, 300 sec: 24131.7). Total num frames: 447832064. Throughput: 0: 6030.8. Samples: 111963900. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 19:49:18,119][15372] Avg episode reward: [(0, '42.348')] [2024-08-05 19:49:19,485][15444] Updated weights for policy 0, policy_version 54671 (0.0021) [2024-08-05 19:49:22,827][15444] Updated weights for policy 0, policy_version 54681 (0.0012) [2024-08-05 19:49:23,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 447946752. Throughput: 0: 6028.9. Samples: 111982000. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 19:49:23,119][15372] Avg episode reward: [(0, '43.736')] [2024-08-05 19:49:26,006][15444] Updated weights for policy 0, policy_version 54691 (0.0012) [2024-08-05 19:49:28,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24029.8, 300 sec: 24159.5). 
Total num frames: 448069632. Throughput: 0: 6047.1. Samples: 112018090. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 19:49:28,119][15372] Avg episode reward: [(0, '44.006')] [2024-08-05 19:49:29,772][15444] Updated weights for policy 0, policy_version 54701 (0.0021) [2024-08-05 19:49:33,081][15444] Updated weights for policy 0, policy_version 54711 (0.0036) [2024-08-05 19:49:33,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.8, 300 sec: 24159.5). Total num frames: 448192512. Throughput: 0: 6014.7. Samples: 112053630. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:49:33,119][15372] Avg episode reward: [(0, '43.271')] [2024-08-05 19:49:36,292][15444] Updated weights for policy 0, policy_version 54721 (0.0013) [2024-08-05 19:49:38,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 448315392. Throughput: 0: 6031.3. Samples: 112072360. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:49:38,126][15372] Avg episode reward: [(0, '44.089')] [2024-08-05 19:49:39,832][15444] Updated weights for policy 0, policy_version 54731 (0.0015) [2024-08-05 19:49:43,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 448430080. Throughput: 0: 6025.3. Samples: 112108750. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:49:43,127][15372] Avg episode reward: [(0, '43.876')] [2024-08-05 19:49:43,490][15444] Updated weights for policy 0, policy_version 54741 (0.0014) [2024-08-05 19:49:46,483][15444] Updated weights for policy 0, policy_version 54751 (0.0020) [2024-08-05 19:49:48,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 448552960. Throughput: 0: 6022.2. Samples: 112144600. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:49:48,126][15372] Avg episode reward: [(0, '43.073')] [2024-08-05 19:49:48,470][15417] Signal inference workers to stop experience collection... 
(20100 times) [2024-08-05 19:49:48,471][15417] Signal inference workers to resume experience collection... (20100 times) [2024-08-05 19:49:48,547][15444] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-08-05 19:49:48,547][15444] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-08-05 19:49:50,116][15444] Updated weights for policy 0, policy_version 54761 (0.0010) [2024-08-05 19:49:53,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 448675840. Throughput: 0: 6027.1. Samples: 112162890. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:49:53,126][15372] Avg episode reward: [(0, '43.397')] [2024-08-05 19:49:53,251][15444] Updated weights for policy 0, policy_version 54771 (0.0018) [2024-08-05 19:49:56,698][15444] Updated weights for policy 0, policy_version 54781 (0.0029) [2024-08-05 19:49:58,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 448798720. Throughput: 0: 6020.2. Samples: 112199130. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:49:58,119][15372] Avg episode reward: [(0, '43.789')] [2024-08-05 19:50:00,076][15444] Updated weights for policy 0, policy_version 54791 (0.0020) [2024-08-05 19:50:03,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24029.7, 300 sec: 24131.7). Total num frames: 448913408. Throughput: 0: 6033.1. Samples: 112235390. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 19:50:03,127][15372] Avg episode reward: [(0, '44.310')] [2024-08-05 19:50:03,582][15444] Updated weights for policy 0, policy_version 54801 (0.0015) [2024-08-05 19:50:07,108][15444] Updated weights for policy 0, policy_version 54811 (0.0015) [2024-08-05 19:50:08,119][15372] Fps is (10 sec: 23755.9, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 449036288. Throughput: 0: 6033.8. Samples: 112253520. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:50:08,120][15372] Avg episode reward: [(0, '44.507')] [2024-08-05 19:50:10,186][15444] Updated weights for policy 0, policy_version 54821 (0.0026) [2024-08-05 19:50:13,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.2, 300 sec: 24131.7). Total num frames: 449159168. Throughput: 0: 6038.2. Samples: 112289810. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:50:13,127][15372] Avg episode reward: [(0, '44.061')] [2024-08-05 19:50:13,685][15444] Updated weights for policy 0, policy_version 54831 (0.0014) [2024-08-05 19:50:17,133][15444] Updated weights for policy 0, policy_version 54841 (0.0018) [2024-08-05 19:50:18,118][15372] Fps is (10 sec: 23757.5, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 449273856. Throughput: 0: 6047.3. Samples: 112325760. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:50:18,119][15372] Avg episode reward: [(0, '43.850')] [2024-08-05 19:50:18,197][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000054844_449282048.pth... [2024-08-05 19:50:18,329][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000054137_443490304.pth [2024-08-05 19:50:20,599][15444] Updated weights for policy 0, policy_version 54851 (0.0012) [2024-08-05 19:50:23,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 449404928. Throughput: 0: 6040.7. Samples: 112344190. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:50:23,119][15372] Avg episode reward: [(0, '43.582')] [2024-08-05 19:50:23,917][15444] Updated weights for policy 0, policy_version 54861 (0.0019) [2024-08-05 19:50:27,090][15444] Updated weights for policy 0, policy_version 54871 (0.0013) [2024-08-05 19:50:28,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 449519616. 
Throughput: 0: 6041.8. Samples: 112380630. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:50:28,126][15372] Avg episode reward: [(0, '42.349')] [2024-08-05 19:50:30,837][15444] Updated weights for policy 0, policy_version 54881 (0.0017) [2024-08-05 19:50:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 449642496. Throughput: 0: 6049.4. Samples: 112416820. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:50:33,119][15372] Avg episode reward: [(0, '42.649')] [2024-08-05 19:50:34,037][15444] Updated weights for policy 0, policy_version 54891 (0.0014) [2024-08-05 19:50:37,450][15444] Updated weights for policy 0, policy_version 54901 (0.0015) [2024-08-05 19:50:38,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 449765376. Throughput: 0: 6034.9. Samples: 112434460. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:50:38,119][15372] Avg episode reward: [(0, '43.050')] [2024-08-05 19:50:40,002][15417] Signal inference workers to stop experience collection... (20150 times) [2024-08-05 19:50:40,003][15417] Signal inference workers to resume experience collection... (20150 times) [2024-08-05 19:50:40,073][15444] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-08-05 19:50:40,073][15444] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-08-05 19:50:40,884][15444] Updated weights for policy 0, policy_version 54911 (0.0019) [2024-08-05 19:50:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 449880064. Throughput: 0: 6043.8. Samples: 112471100. 
Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 19:50:43,119][15372] Avg episode reward: [(0, '43.602')] [2024-08-05 19:50:44,087][15444] Updated weights for policy 0, policy_version 54921 (0.0011) [2024-08-05 19:50:47,746][15444] Updated weights for policy 0, policy_version 54931 (0.0020) [2024-08-05 19:50:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 450002944. Throughput: 0: 6050.7. Samples: 112507670. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 19:50:48,119][15372] Avg episode reward: [(0, '43.132')] [2024-08-05 19:50:50,778][15444] Updated weights for policy 0, policy_version 54941 (0.0018) [2024-08-05 19:50:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 450125824. Throughput: 0: 6054.7. Samples: 112525980. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 19:50:53,126][15372] Avg episode reward: [(0, '43.979')] [2024-08-05 19:50:54,430][15444] Updated weights for policy 0, policy_version 54951 (0.0013) [2024-08-05 19:50:57,823][15444] Updated weights for policy 0, policy_version 54961 (0.0021) [2024-08-05 19:50:58,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 450240512. Throughput: 0: 6034.7. Samples: 112561370. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 19:50:58,119][15372] Avg episode reward: [(0, '44.231')] [2024-08-05 19:51:01,052][15444] Updated weights for policy 0, policy_version 54971 (0.0010) [2024-08-05 19:51:03,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.5, 300 sec: 24132.0). Total num frames: 450363392. Throughput: 0: 6030.9. Samples: 112597150. 
Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 19:51:03,119][15372] Avg episode reward: [(0, '43.937')] [2024-08-05 19:51:04,770][15444] Updated weights for policy 0, policy_version 54981 (0.0012) [2024-08-05 19:51:08,077][15444] Updated weights for policy 0, policy_version 54991 (0.0024) [2024-08-05 19:51:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 450486272. Throughput: 0: 6016.4. Samples: 112614930. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 19:51:08,119][15372] Avg episode reward: [(0, '43.307')] [2024-08-05 19:51:11,581][15444] Updated weights for policy 0, policy_version 55001 (0.0011) [2024-08-05 19:51:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 450600960. Throughput: 0: 6008.9. Samples: 112651030. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 19:51:13,126][15372] Avg episode reward: [(0, '43.753')] [2024-08-05 19:51:14,886][15444] Updated weights for policy 0, policy_version 55011 (0.0019) [2024-08-05 19:51:18,120][15372] Fps is (10 sec: 23753.2, 60 sec: 24165.8, 300 sec: 24131.6). Total num frames: 450723840. Throughput: 0: 6020.5. Samples: 112687750. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:51:18,128][15372] Avg episode reward: [(0, '43.929')] [2024-08-05 19:51:18,178][15444] Updated weights for policy 0, policy_version 55021 (0.0016) [2024-08-05 19:51:21,766][15444] Updated weights for policy 0, policy_version 55031 (0.0014) [2024-08-05 19:51:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 450846720. Throughput: 0: 6038.2. Samples: 112706180. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:51:23,119][15372] Avg episode reward: [(0, '42.572')] [2024-08-05 19:51:24,908][15444] Updated weights for policy 0, policy_version 55041 (0.0011) [2024-08-05 19:51:28,118][15372] Fps is (10 sec: 24579.8, 60 sec: 24166.4, 300 sec: 24159.5). 
Total num frames: 450969600. Throughput: 0: 6039.5. Samples: 112742880. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:51:28,126][15372] Avg episode reward: [(0, '43.388')] [2024-08-05 19:51:28,393][15444] Updated weights for policy 0, policy_version 55051 (0.0019) [2024-08-05 19:51:31,863][15444] Updated weights for policy 0, policy_version 55061 (0.0012) [2024-08-05 19:51:33,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 451092480. Throughput: 0: 6012.2. Samples: 112778220. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:51:33,119][15372] Avg episode reward: [(0, '42.661')] [2024-08-05 19:51:35,099][15444] Updated weights for policy 0, policy_version 55071 (0.0016) [2024-08-05 19:51:37,020][15417] Signal inference workers to stop experience collection... (20200 times) [2024-08-05 19:51:37,021][15417] Signal inference workers to resume experience collection... (20200 times) [2024-08-05 19:51:37,064][15444] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-08-05 19:51:37,066][15444] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-08-05 19:51:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 451215360. Throughput: 0: 6015.1. Samples: 112796660. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:51:38,119][15372] Avg episode reward: [(0, '42.039')] [2024-08-05 19:51:38,453][15444] Updated weights for policy 0, policy_version 55081 (0.0028) [2024-08-05 19:51:41,965][15444] Updated weights for policy 0, policy_version 55091 (0.0040) [2024-08-05 19:51:43,119][15372] Fps is (10 sec: 23755.2, 60 sec: 24166.1, 300 sec: 24131.8). Total num frames: 451330048. Throughput: 0: 6037.2. Samples: 112833050. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:51:43,120][15372] Avg episode reward: [(0, '43.384')] [2024-08-05 19:51:45,237][15444] Updated weights for policy 0, policy_version 55101 (0.0026) [2024-08-05 19:51:48,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 451452928. Throughput: 0: 6066.7. Samples: 112870150. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:51:48,126][15372] Avg episode reward: [(0, '44.242')] [2024-08-05 19:51:48,663][15444] Updated weights for policy 0, policy_version 55111 (0.0019) [2024-08-05 19:51:51,827][15444] Updated weights for policy 0, policy_version 55121 (0.0011) [2024-08-05 19:51:53,118][15372] Fps is (10 sec: 24577.9, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 451575808. Throughput: 0: 6068.4. Samples: 112888010. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:51:53,119][15372] Avg episode reward: [(0, '43.402')] [2024-08-05 19:51:55,293][15444] Updated weights for policy 0, policy_version 55131 (0.0010) [2024-08-05 19:51:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24131.8). Total num frames: 451698688. Throughput: 0: 6068.2. Samples: 112924100. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:51:58,119][15372] Avg episode reward: [(0, '43.548')] [2024-08-05 19:51:58,956][15444] Updated weights for policy 0, policy_version 55141 (0.0030) [2024-08-05 19:52:02,095][15444] Updated weights for policy 0, policy_version 55151 (0.0014) [2024-08-05 19:52:03,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 451813376. Throughput: 0: 6042.8. Samples: 112959670. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:52:03,126][15372] Avg episode reward: [(0, '43.733')] [2024-08-05 19:52:05,775][15444] Updated weights for policy 0, policy_version 55161 (0.0014) [2024-08-05 19:52:08,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.4, 300 sec: 24131.7). 
Total num frames: 451936256. Throughput: 0: 6051.5. Samples: 112978500. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:52:08,119][15372] Avg episode reward: [(0, '43.394')] [2024-08-05 19:52:08,873][15444] Updated weights for policy 0, policy_version 55171 (0.0012) [2024-08-05 19:52:12,533][15444] Updated weights for policy 0, policy_version 55181 (0.0016) [2024-08-05 19:52:13,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24302.7, 300 sec: 24159.4). Total num frames: 452059136. Throughput: 0: 6024.8. Samples: 113014000. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:52:13,120][15372] Avg episode reward: [(0, '43.466')] [2024-08-05 19:52:15,933][15444] Updated weights for policy 0, policy_version 55191 (0.0023) [2024-08-05 19:52:18,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24167.0, 300 sec: 24131.9). Total num frames: 452173824. Throughput: 0: 6038.9. Samples: 113049970. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:52:18,126][15372] Avg episode reward: [(0, '43.689')] [2024-08-05 19:52:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000055197_452173824.pth... [2024-08-05 19:52:18,278][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000054490_446382080.pth [2024-08-05 19:52:19,372][15444] Updated weights for policy 0, policy_version 55201 (0.0013) [2024-08-05 19:52:22,912][15444] Updated weights for policy 0, policy_version 55211 (0.0020) [2024-08-05 19:52:23,118][15372] Fps is (10 sec: 23758.6, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 452296704. Throughput: 0: 6029.8. Samples: 113068000. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:52:23,119][15372] Avg episode reward: [(0, '43.602')] [2024-08-05 19:52:26,177][15444] Updated weights for policy 0, policy_version 55221 (0.0023) [2024-08-05 19:52:27,234][15417] Signal inference workers to stop experience collection... (20250 times) [2024-08-05 19:52:27,242][15417] Signal inference workers to resume experience collection... (20250 times) [2024-08-05 19:52:27,288][15444] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-08-05 19:52:27,288][15444] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-08-05 19:52:28,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 452411392. Throughput: 0: 6011.2. Samples: 113103550. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:52:28,119][15372] Avg episode reward: [(0, '43.250')] [2024-08-05 19:52:29,552][15444] Updated weights for policy 0, policy_version 55231 (0.0017) [2024-08-05 19:52:32,838][15444] Updated weights for policy 0, policy_version 55241 (0.0014) [2024-08-05 19:52:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 452534272. Throughput: 0: 6000.9. Samples: 113140190. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:52:33,119][15372] Avg episode reward: [(0, '43.408')] [2024-08-05 19:52:36,233][15444] Updated weights for policy 0, policy_version 55251 (0.0017) [2024-08-05 19:52:38,118][15372] Fps is (10 sec: 23757.2, 60 sec: 23893.4, 300 sec: 24103.9). Total num frames: 452648960. Throughput: 0: 6016.4. Samples: 113158750. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:52:38,126][15372] Avg episode reward: [(0, '43.686')] [2024-08-05 19:52:39,697][15444] Updated weights for policy 0, policy_version 55261 (0.0018) [2024-08-05 19:52:42,974][15444] Updated weights for policy 0, policy_version 55271 (0.0016) [2024-08-05 19:52:43,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.6, 300 sec: 24131.7). Total num frames: 452780032. Throughput: 0: 6020.6. Samples: 113195030. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:52:43,119][15372] Avg episode reward: [(0, '43.848')] [2024-08-05 19:52:46,666][15444] Updated weights for policy 0, policy_version 55281 (0.0031) [2024-08-05 19:52:48,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24029.7, 300 sec: 24131.7). Total num frames: 452894720. Throughput: 0: 6005.8. Samples: 113229930. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:52:48,119][15372] Avg episode reward: [(0, '43.049')] [2024-08-05 19:52:50,056][15444] Updated weights for policy 0, policy_version 55291 (0.0016) [2024-08-05 19:52:53,118][15372] Fps is (10 sec: 23757.4, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 453017600. Throughput: 0: 5998.2. Samples: 113248420. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:52:53,126][15372] Avg episode reward: [(0, '43.081')] [2024-08-05 19:52:53,334][15444] Updated weights for policy 0, policy_version 55301 (0.0022) [2024-08-05 19:52:56,762][15444] Updated weights for policy 0, policy_version 55311 (0.0027) [2024-08-05 19:52:58,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 453140480. Throughput: 0: 6018.1. Samples: 113284810. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 19:52:58,119][15372] Avg episode reward: [(0, '43.312')] [2024-08-05 19:53:00,058][15444] Updated weights for policy 0, policy_version 55321 (0.0012) [2024-08-05 19:53:03,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24029.9, 300 sec: 24131.7). 
Total num frames: 453255168. Throughput: 0: 6023.8. Samples: 113321040. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 19:53:03,126][15372] Avg episode reward: [(0, '44.062')] [2024-08-05 19:53:03,526][15444] Updated weights for policy 0, policy_version 55331 (0.0024) [2024-08-05 19:53:07,113][15444] Updated weights for policy 0, policy_version 55341 (0.0037) [2024-08-05 19:53:08,112][15417] Signal inference workers to stop experience collection... (20300 times) [2024-08-05 19:53:08,117][15417] Signal inference workers to resume experience collection... (20300 times) [2024-08-05 19:53:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 453378048. Throughput: 0: 6025.8. Samples: 113339160. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 19:53:08,119][15372] Avg episode reward: [(0, '45.096')] [2024-08-05 19:53:08,172][15444] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-08-05 19:53:08,172][15444] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-08-05 19:53:08,272][15417] Saving new best policy, reward=45.096! [2024-08-05 19:53:10,240][15444] Updated weights for policy 0, policy_version 55351 (0.0026) [2024-08-05 19:53:13,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24030.2, 300 sec: 24131.7). Total num frames: 453500928. Throughput: 0: 6043.6. Samples: 113375510. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 19:53:13,126][15372] Avg episode reward: [(0, '44.458')] [2024-08-05 19:53:13,899][15444] Updated weights for policy 0, policy_version 55361 (0.0020) [2024-08-05 19:53:16,946][15444] Updated weights for policy 0, policy_version 55371 (0.0018) [2024-08-05 19:53:18,119][15372] Fps is (10 sec: 23754.4, 60 sec: 24029.5, 300 sec: 24103.8). Total num frames: 453615616. Throughput: 0: 6031.9. Samples: 113411630. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 19:53:18,120][15372] Avg episode reward: [(0, '42.901')] [2024-08-05 19:53:20,466][15444] Updated weights for policy 0, policy_version 55381 (0.0013) [2024-08-05 19:53:23,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 453746688. Throughput: 0: 6025.3. Samples: 113429890. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 19:53:23,119][15372] Avg episode reward: [(0, '42.929')] [2024-08-05 19:53:23,748][15444] Updated weights for policy 0, policy_version 55391 (0.0012) [2024-08-05 19:53:27,132][15444] Updated weights for policy 0, policy_version 55401 (0.0016) [2024-08-05 19:53:28,118][15372] Fps is (10 sec: 24578.5, 60 sec: 24166.5, 300 sec: 24131.8). Total num frames: 453861376. Throughput: 0: 6032.7. Samples: 113466500. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 19:53:28,119][15372] Avg episode reward: [(0, '43.076')] [2024-08-05 19:53:30,694][15444] Updated weights for policy 0, policy_version 55411 (0.0011) [2024-08-05 19:53:33,119][15372] Fps is (10 sec: 23757.2, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 453984256. Throughput: 0: 6070.9. Samples: 113503120. Policy #0 lag: (min: 1.0, avg: 4.0, max: 9.0) [2024-08-05 19:53:33,120][15372] Avg episode reward: [(0, '43.028')] [2024-08-05 19:53:33,808][15444] Updated weights for policy 0, policy_version 55421 (0.0017) [2024-08-05 19:53:37,624][15444] Updated weights for policy 0, policy_version 55431 (0.0010) [2024-08-05 19:53:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 454107136. Throughput: 0: 6046.0. Samples: 113520490. Policy #0 lag: (min: 1.0, avg: 4.0, max: 9.0) [2024-08-05 19:53:38,119][15372] Avg episode reward: [(0, '43.298')] [2024-08-05 19:53:40,691][15444] Updated weights for policy 0, policy_version 55441 (0.0021) [2024-08-05 19:53:43,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24029.9, 300 sec: 24131.7). 
Total num frames: 454221824. Throughput: 0: 6028.2. Samples: 113556080. Policy #0 lag: (min: 1.0, avg: 4.0, max: 9.0) [2024-08-05 19:53:43,127][15372] Avg episode reward: [(0, '43.636')] [2024-08-05 19:53:44,205][15444] Updated weights for policy 0, policy_version 55451 (0.0025) [2024-08-05 19:53:47,896][15444] Updated weights for policy 0, policy_version 55461 (0.0028) [2024-08-05 19:53:48,119][15372] Fps is (10 sec: 22937.2, 60 sec: 24029.9, 300 sec: 24076.1). Total num frames: 454336512. Throughput: 0: 6023.7. Samples: 113592110. Policy #0 lag: (min: 1.0, avg: 4.0, max: 9.0) [2024-08-05 19:53:48,120][15372] Avg episode reward: [(0, '43.410')] [2024-08-05 19:53:50,251][15417] Signal inference workers to stop experience collection... (20350 times) [2024-08-05 19:53:50,251][15417] Signal inference workers to resume experience collection... (20350 times) [2024-08-05 19:53:50,293][15444] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-08-05 19:53:50,293][15444] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-08-05 19:53:51,016][15444] Updated weights for policy 0, policy_version 55471 (0.0012) [2024-08-05 19:53:53,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 454467584. Throughput: 0: 6027.3. Samples: 113610390. Policy #0 lag: (min: 1.0, avg: 4.0, max: 9.0) [2024-08-05 19:53:53,119][15372] Avg episode reward: [(0, '43.642')] [2024-08-05 19:53:54,352][15444] Updated weights for policy 0, policy_version 55481 (0.0011) [2024-08-05 19:53:57,748][15444] Updated weights for policy 0, policy_version 55491 (0.0018) [2024-08-05 19:53:58,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 454582272. Throughput: 0: 6036.4. Samples: 113647150. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 9.0) [2024-08-05 19:53:58,119][15372] Avg episode reward: [(0, '42.853')] [2024-08-05 19:54:01,232][15444] Updated weights for policy 0, policy_version 55501 (0.0014) [2024-08-05 19:54:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 454705152. Throughput: 0: 6030.1. Samples: 113682980. Policy #0 lag: (min: 1.0, avg: 4.0, max: 9.0) [2024-08-05 19:54:03,126][15372] Avg episode reward: [(0, '42.685')] [2024-08-05 19:54:04,557][15444] Updated weights for policy 0, policy_version 55511 (0.0010) [2024-08-05 19:54:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 454819840. Throughput: 0: 6041.6. Samples: 113701760. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:54:08,126][15372] Avg episode reward: [(0, '43.826')] [2024-08-05 19:54:08,152][15444] Updated weights for policy 0, policy_version 55521 (0.0014) [2024-08-05 19:54:11,263][15444] Updated weights for policy 0, policy_version 55531 (0.0011) [2024-08-05 19:54:13,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 454950912. Throughput: 0: 6028.4. Samples: 113737780. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:54:13,126][15372] Avg episode reward: [(0, '43.858')] [2024-08-05 19:54:14,738][15444] Updated weights for policy 0, policy_version 55541 (0.0013) [2024-08-05 19:54:17,921][15444] Updated weights for policy 0, policy_version 55551 (0.0012) [2024-08-05 19:54:18,118][15372] Fps is (10 sec: 25395.1, 60 sec: 24303.3, 300 sec: 24159.5). Total num frames: 455073792. Throughput: 0: 6024.4. Samples: 113774220. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:54:18,119][15372] Avg episode reward: [(0, '42.877')] [2024-08-05 19:54:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000055551_455073792.pth... 
[2024-08-05 19:54:18,230][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000054844_449282048.pth [2024-08-05 19:54:21,355][15444] Updated weights for policy 0, policy_version 55561 (0.0011) [2024-08-05 19:54:23,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 455188480. Throughput: 0: 6044.7. Samples: 113792500. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:54:23,126][15372] Avg episode reward: [(0, '41.164')] [2024-08-05 19:54:25,008][15444] Updated weights for policy 0, policy_version 55571 (0.0025) [2024-08-05 19:54:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 455311360. Throughput: 0: 6059.8. Samples: 113828770. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:54:28,126][15372] Avg episode reward: [(0, '41.995')] [2024-08-05 19:54:28,217][15444] Updated weights for policy 0, policy_version 55581 (0.0014) [2024-08-05 19:54:31,597][15444] Updated weights for policy 0, policy_version 55591 (0.0018) [2024-08-05 19:54:33,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 455434240. Throughput: 0: 6051.8. Samples: 113864440. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:54:33,126][15372] Avg episode reward: [(0, '43.472')] [2024-08-05 19:54:34,994][15444] Updated weights for policy 0, policy_version 55601 (0.0012) [2024-08-05 19:54:38,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 455548928. Throughput: 0: 6066.9. Samples: 113883400. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-08-05 19:54:38,126][15372] Avg episode reward: [(0, '42.673')] [2024-08-05 19:54:38,281][15417] Signal inference workers to stop experience collection... (20400 times) [2024-08-05 19:54:38,281][15417] Signal inference workers to resume experience collection... 
(20400 times) [2024-08-05 19:54:38,310][15444] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-08-05 19:54:38,310][15444] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-08-05 19:54:38,373][15444] Updated weights for policy 0, policy_version 55611 (0.0017) [2024-08-05 19:54:41,974][15444] Updated weights for policy 0, policy_version 55621 (0.0019) [2024-08-05 19:54:43,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 455680000. Throughput: 0: 6038.2. Samples: 113918870. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:54:43,119][15372] Avg episode reward: [(0, '43.283')] [2024-08-05 19:54:45,189][15444] Updated weights for policy 0, policy_version 55631 (0.0012) [2024-08-05 19:54:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24131.7). Total num frames: 455794688. Throughput: 0: 6042.2. Samples: 113954880. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:54:48,126][15372] Avg episode reward: [(0, '43.587')] [2024-08-05 19:54:48,672][15444] Updated weights for policy 0, policy_version 55641 (0.0013) [2024-08-05 19:54:52,050][15444] Updated weights for policy 0, policy_version 55651 (0.0032) [2024-08-05 19:54:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 455917568. Throughput: 0: 6033.1. Samples: 113973250. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:54:53,119][15372] Avg episode reward: [(0, '43.696')] [2024-08-05 19:54:55,366][15444] Updated weights for policy 0, policy_version 55661 (0.0011) [2024-08-05 19:54:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 456040448. Throughput: 0: 6047.3. Samples: 114009910. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:54:58,120][15372] Avg episode reward: [(0, '43.485')] [2024-08-05 19:54:58,818][15444] Updated weights for policy 0, policy_version 55671 (0.0030) [2024-08-05 19:55:02,069][15444] Updated weights for policy 0, policy_version 55681 (0.0014) [2024-08-05 19:55:03,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 456155136. Throughput: 0: 6039.8. Samples: 114046010. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:55:03,119][15372] Avg episode reward: [(0, '43.784')] [2024-08-05 19:55:05,540][15444] Updated weights for policy 0, policy_version 55691 (0.0020) [2024-08-05 19:55:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24159.5). Total num frames: 456286208. Throughput: 0: 6050.0. Samples: 114064750. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:55:08,119][15372] Avg episode reward: [(0, '43.326')] [2024-08-05 19:55:08,916][15444] Updated weights for policy 0, policy_version 55701 (0.0021) [2024-08-05 19:55:12,365][15444] Updated weights for policy 0, policy_version 55711 (0.0015) [2024-08-05 19:55:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 456400896. Throughput: 0: 6044.4. Samples: 114100770. Policy #0 lag: (min: 1.0, avg: 4.0, max: 8.0) [2024-08-05 19:55:13,126][15372] Avg episode reward: [(0, '43.264')] [2024-08-05 19:55:15,782][15444] Updated weights for policy 0, policy_version 55721 (0.0018) [2024-08-05 19:55:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 456523776. Throughput: 0: 6048.0. Samples: 114136600. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:55:18,126][15372] Avg episode reward: [(0, '44.004')] [2024-08-05 19:55:19,343][15444] Updated weights for policy 0, policy_version 55731 (0.0012) [2024-08-05 19:55:22,467][15444] Updated weights for policy 0, policy_version 55741 (0.0015) [2024-08-05 19:55:23,119][15372] Fps is (10 sec: 24574.7, 60 sec: 24302.7, 300 sec: 24159.4). Total num frames: 456646656. Throughput: 0: 6029.0. Samples: 114154710. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:55:23,120][15372] Avg episode reward: [(0, '43.230')] [2024-08-05 19:55:25,892][15444] Updated weights for policy 0, policy_version 55751 (0.0013) [2024-08-05 19:55:28,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 456761344. Throughput: 0: 6040.0. Samples: 114190670. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:55:28,130][15372] Avg episode reward: [(0, '43.037')] [2024-08-05 19:55:29,328][15444] Updated weights for policy 0, policy_version 55761 (0.0011) [2024-08-05 19:55:32,634][15444] Updated weights for policy 0, policy_version 55771 (0.0021) [2024-08-05 19:55:33,118][15372] Fps is (10 sec: 23758.1, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 456884224. Throughput: 0: 6036.4. Samples: 114226520. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:55:33,119][15372] Avg episode reward: [(0, '43.719')] [2024-08-05 19:55:36,280][15444] Updated weights for policy 0, policy_version 55781 (0.0012) [2024-08-05 19:55:37,310][15417] Signal inference workers to stop experience collection... (20450 times) [2024-08-05 19:55:37,312][15417] Signal inference workers to resume experience collection... 
(20450 times) [2024-08-05 19:55:37,386][15444] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-08-05 19:55:37,386][15444] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-08-05 19:55:38,118][15372] Fps is (10 sec: 23757.6, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 456998912. Throughput: 0: 6039.8. Samples: 114245040. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:55:38,119][15372] Avg episode reward: [(0, '43.638')] [2024-08-05 19:55:39,365][15444] Updated weights for policy 0, policy_version 55791 (0.0013) [2024-08-05 19:55:42,908][15444] Updated weights for policy 0, policy_version 55801 (0.0030) [2024-08-05 19:55:43,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 457121792. Throughput: 0: 6037.1. Samples: 114281580. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:55:43,119][15372] Avg episode reward: [(0, '42.753')] [2024-08-05 19:55:46,410][15444] Updated weights for policy 0, policy_version 55811 (0.0021) [2024-08-05 19:55:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 457244672. Throughput: 0: 6021.8. Samples: 114316990. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 19:55:48,126][15372] Avg episode reward: [(0, '43.162')] [2024-08-05 19:55:49,721][15444] Updated weights for policy 0, policy_version 55821 (0.0018) [2024-08-05 19:55:53,089][15444] Updated weights for policy 0, policy_version 55831 (0.0013) [2024-08-05 19:55:53,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 457367552. Throughput: 0: 6027.6. Samples: 114335990. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:55:53,119][15372] Avg episode reward: [(0, '43.861')] [2024-08-05 19:55:56,467][15444] Updated weights for policy 0, policy_version 55841 (0.0024) [2024-08-05 19:55:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24159.5). 
Total num frames: 457490432. Throughput: 0: 6022.9. Samples: 114371800. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:55:58,126][15372] Avg episode reward: [(0, '43.719')] [2024-08-05 19:55:59,877][15444] Updated weights for policy 0, policy_version 55851 (0.0010) [2024-08-05 19:56:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 457605120. Throughput: 0: 6037.1. Samples: 114408270. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:56:03,126][15372] Avg episode reward: [(0, '43.201')] [2024-08-05 19:56:03,388][15444] Updated weights for policy 0, policy_version 55861 (0.0021) [2024-08-05 19:56:06,502][15444] Updated weights for policy 0, policy_version 55871 (0.0025) [2024-08-05 19:56:08,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 457728000. Throughput: 0: 6033.0. Samples: 114426190. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:56:08,126][15372] Avg episode reward: [(0, '42.894')] [2024-08-05 19:56:10,064][15444] Updated weights for policy 0, policy_version 55881 (0.0020) [2024-08-05 19:56:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24159.6). Total num frames: 457850880. Throughput: 0: 6053.6. Samples: 114463080. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:56:13,126][15372] Avg episode reward: [(0, '43.259')] [2024-08-05 19:56:13,259][15444] Updated weights for policy 0, policy_version 55891 (0.0018) [2024-08-05 19:56:16,591][15444] Updated weights for policy 0, policy_version 55901 (0.0026) [2024-08-05 19:56:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 457965568. Throughput: 0: 6052.2. Samples: 114498870. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:56:18,126][15372] Avg episode reward: [(0, '43.107')] [2024-08-05 19:56:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000055904_457965568.pth... [2024-08-05 19:56:18,266][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000055197_452173824.pth [2024-08-05 19:56:20,278][15444] Updated weights for policy 0, policy_version 55911 (0.0011) [2024-08-05 19:56:23,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 458088448. Throughput: 0: 6046.9. Samples: 114517150. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:56:23,119][15372] Avg episode reward: [(0, '43.407')] [2024-08-05 19:56:23,490][15444] Updated weights for policy 0, policy_version 55921 (0.0029) [2024-08-05 19:56:26,991][15444] Updated weights for policy 0, policy_version 55931 (0.0018) [2024-08-05 19:56:28,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 458211328. Throughput: 0: 6031.1. Samples: 114552980. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:56:28,119][15372] Avg episode reward: [(0, '43.504')] [2024-08-05 19:56:30,347][15444] Updated weights for policy 0, policy_version 55941 (0.0019) [2024-08-05 19:56:33,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 458326016. Throughput: 0: 6045.6. Samples: 114589040. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:56:33,119][15372] Avg episode reward: [(0, '43.826')] [2024-08-05 19:56:33,420][15417] Signal inference workers to stop experience collection... (20500 times) [2024-08-05 19:56:33,423][15417] Signal inference workers to resume experience collection... 
(20500 times) [2024-08-05 19:56:33,494][15444] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-08-05 19:56:33,494][15444] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-08-05 19:56:33,759][15444] Updated weights for policy 0, policy_version 55951 (0.0014) [2024-08-05 19:56:37,331][15444] Updated weights for policy 0, policy_version 55961 (0.0020) [2024-08-05 19:56:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24131.8). Total num frames: 458448896. Throughput: 0: 6020.0. Samples: 114606890. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:56:38,119][15372] Avg episode reward: [(0, '44.215')] [2024-08-05 19:56:40,564][15444] Updated weights for policy 0, policy_version 55971 (0.0020) [2024-08-05 19:56:43,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 458571776. Throughput: 0: 6018.7. Samples: 114642640. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:56:43,126][15372] Avg episode reward: [(0, '44.022')] [2024-08-05 19:56:44,115][15444] Updated weights for policy 0, policy_version 55981 (0.0029) [2024-08-05 19:56:47,485][15444] Updated weights for policy 0, policy_version 55991 (0.0011) [2024-08-05 19:56:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 458686464. Throughput: 0: 5992.2. Samples: 114677920. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:56:48,119][15372] Avg episode reward: [(0, '43.766')] [2024-08-05 19:56:51,109][15444] Updated weights for policy 0, policy_version 56001 (0.0014) [2024-08-05 19:56:53,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 458809344. Throughput: 0: 6000.2. Samples: 114696200. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:56:53,119][15372] Avg episode reward: [(0, '43.758')] [2024-08-05 19:56:54,452][15444] Updated weights for policy 0, policy_version 56011 (0.0011) [2024-08-05 19:56:57,707][15444] Updated weights for policy 0, policy_version 56021 (0.0010) [2024-08-05 19:56:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 458932224. Throughput: 0: 5997.1. Samples: 114732950. Policy #0 lag: (min: 1.0, avg: 3.9, max: 7.0) [2024-08-05 19:56:58,119][15372] Avg episode reward: [(0, '45.089')] [2024-08-05 19:57:01,392][15444] Updated weights for policy 0, policy_version 56031 (0.0032) [2024-08-05 19:57:03,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 459046912. Throughput: 0: 5986.9. Samples: 114768280. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:57:03,119][15372] Avg episode reward: [(0, '44.613')] [2024-08-05 19:57:04,458][15444] Updated weights for policy 0, policy_version 56041 (0.0028) [2024-08-05 19:57:08,061][15444] Updated weights for policy 0, policy_version 56051 (0.0017) [2024-08-05 19:57:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24104.0). Total num frames: 459169792. Throughput: 0: 5990.9. Samples: 114786740. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:57:08,119][15372] Avg episode reward: [(0, '42.933')] [2024-08-05 19:57:11,288][15444] Updated weights for policy 0, policy_version 56061 (0.0019) [2024-08-05 19:57:13,119][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 459292672. Throughput: 0: 5992.9. Samples: 114822660. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:57:13,126][15372] Avg episode reward: [(0, '44.020')] [2024-08-05 19:57:14,709][15417] Signal inference workers to stop experience collection... (20550 times) [2024-08-05 19:57:14,709][15417] Signal inference workers to resume experience collection... 
(20550 times) [2024-08-05 19:57:14,768][15444] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-08-05 19:57:14,768][15444] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-08-05 19:57:14,807][15444] Updated weights for policy 0, policy_version 56071 (0.0019) [2024-08-05 19:57:18,132][15372] Fps is (10 sec: 23724.9, 60 sec: 24024.5, 300 sec: 24102.8). Total num frames: 459407360. Throughput: 0: 6008.2. Samples: 114859490. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:57:18,132][15372] Avg episode reward: [(0, '44.549')] [2024-08-05 19:57:18,377][15444] Updated weights for policy 0, policy_version 56081 (0.0020) [2024-08-05 19:57:21,447][15444] Updated weights for policy 0, policy_version 56091 (0.0044) [2024-08-05 19:57:23,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 459530240. Throughput: 0: 6016.2. Samples: 114877620. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:57:23,126][15372] Avg episode reward: [(0, '43.674')] [2024-08-05 19:57:25,062][15444] Updated weights for policy 0, policy_version 56101 (0.0026) [2024-08-05 19:57:28,118][15372] Fps is (10 sec: 24609.1, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 459653120. Throughput: 0: 6022.5. Samples: 114913650. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:57:28,126][15372] Avg episode reward: [(0, '42.712')] [2024-08-05 19:57:28,483][15444] Updated weights for policy 0, policy_version 56111 (0.0015) [2024-08-05 19:57:31,759][15444] Updated weights for policy 0, policy_version 56121 (0.0014) [2024-08-05 19:57:33,119][15372] Fps is (10 sec: 24574.4, 60 sec: 24166.1, 300 sec: 24159.4). Total num frames: 459776000. Throughput: 0: 6025.7. Samples: 114949080. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:57:33,120][15372] Avg episode reward: [(0, '43.752')] [2024-08-05 19:57:35,186][15444] Updated weights for policy 0, policy_version 56131 (0.0023) [2024-08-05 19:57:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 459890688. Throughput: 0: 6024.9. Samples: 114967320. Policy #0 lag: (min: 0.0, avg: 3.4, max: 9.0) [2024-08-05 19:57:38,126][15372] Avg episode reward: [(0, '44.475')] [2024-08-05 19:57:38,840][15444] Updated weights for policy 0, policy_version 56141 (0.0029) [2024-08-05 19:57:42,385][15444] Updated weights for policy 0, policy_version 56151 (0.0012) [2024-08-05 19:57:43,119][15372] Fps is (10 sec: 23758.2, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 460013568. Throughput: 0: 5983.1. Samples: 115002190. Policy #0 lag: (min: 0.0, avg: 3.4, max: 9.0) [2024-08-05 19:57:43,119][15372] Avg episode reward: [(0, '43.532')] [2024-08-05 19:57:45,418][15444] Updated weights for policy 0, policy_version 56161 (0.0021) [2024-08-05 19:57:48,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 460128256. Throughput: 0: 5996.9. Samples: 115038140. Policy #0 lag: (min: 0.0, avg: 3.4, max: 9.0) [2024-08-05 19:57:48,126][15372] Avg episode reward: [(0, '42.947')] [2024-08-05 19:57:48,998][15444] Updated weights for policy 0, policy_version 56171 (0.0012) [2024-08-05 19:57:52,278][15444] Updated weights for policy 0, policy_version 56181 (0.0017) [2024-08-05 19:57:53,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23893.3, 300 sec: 24076.1). Total num frames: 460242944. Throughput: 0: 5997.8. Samples: 115056640. Policy #0 lag: (min: 0.0, avg: 3.4, max: 9.0) [2024-08-05 19:57:53,119][15372] Avg episode reward: [(0, '42.851')] [2024-08-05 19:57:55,720][15444] Updated weights for policy 0, policy_version 56191 (0.0032) [2024-08-05 19:57:57,795][15417] Signal inference workers to stop experience collection... 
(20600 times) [2024-08-05 19:57:57,797][15417] Signal inference workers to resume experience collection... (20600 times) [2024-08-05 19:57:57,837][15444] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-08-05 19:57:57,837][15444] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-08-05 19:57:58,118][15372] Fps is (10 sec: 23757.3, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 460365824. Throughput: 0: 5985.8. Samples: 115092020. Policy #0 lag: (min: 0.0, avg: 3.4, max: 9.0) [2024-08-05 19:57:58,119][15372] Avg episode reward: [(0, '44.130')] [2024-08-05 19:57:59,506][15444] Updated weights for policy 0, policy_version 56201 (0.0012) [2024-08-05 19:58:02,688][15444] Updated weights for policy 0, policy_version 56211 (0.0026) [2024-08-05 19:58:03,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 460488704. Throughput: 0: 5962.7. Samples: 115127730. Policy #0 lag: (min: 0.0, avg: 3.4, max: 9.0) [2024-08-05 19:58:03,119][15372] Avg episode reward: [(0, '44.173')] [2024-08-05 19:58:06,131][15444] Updated weights for policy 0, policy_version 56221 (0.0019) [2024-08-05 19:58:08,120][15372] Fps is (10 sec: 23754.1, 60 sec: 23892.9, 300 sec: 24076.1). Total num frames: 460603392. Throughput: 0: 5975.6. Samples: 115146530. Policy #0 lag: (min: 0.0, avg: 3.4, max: 9.0) [2024-08-05 19:58:08,128][15372] Avg episode reward: [(0, '42.524')] [2024-08-05 19:58:09,487][15444] Updated weights for policy 0, policy_version 56231 (0.0026) [2024-08-05 19:58:12,954][15444] Updated weights for policy 0, policy_version 56241 (0.0028) [2024-08-05 19:58:13,119][15372] Fps is (10 sec: 23756.4, 60 sec: 23893.3, 300 sec: 24104.0). Total num frames: 460726272. Throughput: 0: 5977.5. Samples: 115182640. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:58:13,119][15372] Avg episode reward: [(0, '43.075')] [2024-08-05 19:58:16,837][15444] Updated weights for policy 0, policy_version 56251 (0.0010) [2024-08-05 19:58:18,119][15372] Fps is (10 sec: 23758.9, 60 sec: 23898.6, 300 sec: 24048.4). Total num frames: 460840960. Throughput: 0: 5909.4. Samples: 115215000. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:58:18,119][15372] Avg episode reward: [(0, '42.828')] [2024-08-05 19:58:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000056255_460840960.pth... [2024-08-05 19:58:18,243][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000055551_455073792.pth [2024-08-05 19:58:20,247][15444] Updated weights for policy 0, policy_version 56261 (0.0022) [2024-08-05 19:58:23,118][15372] Fps is (10 sec: 21299.5, 60 sec: 23483.7, 300 sec: 23992.8). Total num frames: 460939264. Throughput: 0: 5884.2. Samples: 115232110. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:58:23,126][15372] Avg episode reward: [(0, '42.437')] [2024-08-05 19:58:24,068][15444] Updated weights for policy 0, policy_version 56271 (0.0029) [2024-08-05 19:58:28,029][15444] Updated weights for policy 0, policy_version 56281 (0.0019) [2024-08-05 19:58:28,119][15372] Fps is (10 sec: 21299.1, 60 sec: 23347.1, 300 sec: 23965.1). Total num frames: 461053952. Throughput: 0: 5858.9. Samples: 115265840. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:58:28,119][15372] Avg episode reward: [(0, '43.767')] [2024-08-05 19:58:31,067][15444] Updated weights for policy 0, policy_version 56291 (0.0022) [2024-08-05 19:58:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23347.5, 300 sec: 23965.1). Total num frames: 461176832. Throughput: 0: 5825.6. Samples: 115300290. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:58:33,119][15372] Avg episode reward: [(0, '43.953')] [2024-08-05 19:58:35,193][15444] Updated weights for policy 0, policy_version 56301 (0.0011) [2024-08-05 19:58:37,717][15417] Signal inference workers to stop experience collection... (20650 times) [2024-08-05 19:58:37,718][15417] Signal inference workers to resume experience collection... (20650 times) [2024-08-05 19:58:37,779][15444] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-08-05 19:58:37,794][15444] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-08-05 19:58:38,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23347.1, 300 sec: 23965.1). Total num frames: 461291520. Throughput: 0: 5774.4. Samples: 115316490. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:58:38,119][15372] Avg episode reward: [(0, '44.021')] [2024-08-05 19:58:38,364][15444] Updated weights for policy 0, policy_version 56311 (0.0027) [2024-08-05 19:58:42,278][15444] Updated weights for policy 0, policy_version 56321 (0.0036) [2024-08-05 19:58:43,121][15372] Fps is (10 sec: 22931.0, 60 sec: 23209.6, 300 sec: 23964.9). Total num frames: 461406208. Throughput: 0: 5732.5. Samples: 115350000. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 19:58:43,122][15372] Avg episode reward: [(0, '44.860')] [2024-08-05 19:58:45,788][15444] Updated weights for policy 0, policy_version 56331 (0.0012) [2024-08-05 19:58:48,118][15372] Fps is (10 sec: 22119.2, 60 sec: 23074.2, 300 sec: 23881.8). Total num frames: 461512704. Throughput: 0: 5722.9. Samples: 115385260. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:58:48,119][15372] Avg episode reward: [(0, '43.356')] [2024-08-05 19:58:49,093][15444] Updated weights for policy 0, policy_version 56341 (0.0038) [2024-08-05 19:58:53,118][15372] Fps is (10 sec: 20485.9, 60 sec: 22801.1, 300 sec: 23826.2). Total num frames: 461611008. Throughput: 0: 5665.9. 
Samples: 115401490. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:58:53,126][15372] Avg episode reward: [(0, '42.891')] [2024-08-05 19:58:53,747][15444] Updated weights for policy 0, policy_version 56351 (0.0016) [2024-08-05 19:58:56,979][15444] Updated weights for policy 0, policy_version 56361 (0.0012) [2024-08-05 19:58:58,118][15372] Fps is (10 sec: 21299.1, 60 sec: 22664.5, 300 sec: 23798.5). Total num frames: 461725696. Throughput: 0: 5555.4. Samples: 115432630. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:58:58,119][15372] Avg episode reward: [(0, '43.523')] [2024-08-05 19:59:00,785][15444] Updated weights for policy 0, policy_version 56371 (0.0022) [2024-08-05 19:59:03,118][15372] Fps is (10 sec: 22937.6, 60 sec: 22528.0, 300 sec: 23798.5). Total num frames: 461840384. Throughput: 0: 5617.1. Samples: 115467770. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:59:03,126][15372] Avg episode reward: [(0, '43.793')] [2024-08-05 19:59:04,121][15444] Updated weights for policy 0, policy_version 56381 (0.0020) [2024-08-05 19:59:08,118][15372] Fps is (10 sec: 22118.5, 60 sec: 22391.9, 300 sec: 23715.2). Total num frames: 461946880. Throughput: 0: 5567.8. Samples: 115482660. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:59:08,126][15372] Avg episode reward: [(0, '43.714')] [2024-08-05 19:59:08,214][15444] Updated weights for policy 0, policy_version 56391 (0.0031) [2024-08-05 19:59:12,172][15444] Updated weights for policy 0, policy_version 56401 (0.0025) [2024-08-05 19:59:13,119][15372] Fps is (10 sec: 21298.9, 60 sec: 22118.4, 300 sec: 23659.6). Total num frames: 462053376. Throughput: 0: 5557.6. Samples: 115515930. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:59:13,119][15372] Avg episode reward: [(0, '43.249')] [2024-08-05 19:59:16,785][15444] Updated weights for policy 0, policy_version 56411 (0.0026) [2024-08-05 19:59:18,118][15372] Fps is (10 sec: 20479.9, 60 sec: 21845.4, 300 sec: 23604.1). Total num frames: 462151680. Throughput: 0: 5396.9. Samples: 115543150. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 19:59:18,120][15372] Avg episode reward: [(0, '43.471')] [2024-08-05 19:59:20,217][15444] Updated weights for policy 0, policy_version 56421 (0.0013) [2024-08-05 19:59:23,120][15372] Fps is (10 sec: 20477.0, 60 sec: 21981.3, 300 sec: 23548.4). Total num frames: 462258176. Throughput: 0: 5425.8. Samples: 115560660. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:59:23,121][15372] Avg episode reward: [(0, '44.854')] [2024-08-05 19:59:23,924][15444] Updated weights for policy 0, policy_version 56431 (0.0018) [2024-08-05 19:59:27,431][15444] Updated weights for policy 0, policy_version 56441 (0.0020) [2024-08-05 19:59:28,118][15372] Fps is (10 sec: 22118.4, 60 sec: 21982.0, 300 sec: 23520.8). Total num frames: 462372864. Throughput: 0: 5431.0. Samples: 115594380. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:59:28,126][15372] Avg episode reward: [(0, '44.385')] [2024-08-05 19:59:31,223][15444] Updated weights for policy 0, policy_version 56451 (0.0028) [2024-08-05 19:59:33,118][15372] Fps is (10 sec: 22941.3, 60 sec: 21845.3, 300 sec: 23520.8). Total num frames: 462487552. Throughput: 0: 5405.3. Samples: 115628500. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:59:33,119][15372] Avg episode reward: [(0, '43.531')] [2024-08-05 19:59:34,609][15444] Updated weights for policy 0, policy_version 56461 (0.0026) [2024-08-05 19:59:38,054][15444] Updated weights for policy 0, policy_version 56471 (0.0011) [2024-08-05 19:59:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 21982.0, 300 sec: 23493.0). 
Total num frames: 462610432. Throughput: 0: 5444.4. Samples: 115646490. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:59:38,119][15372] Avg episode reward: [(0, '43.473')] [2024-08-05 19:59:41,766][15444] Updated weights for policy 0, policy_version 56481 (0.0014) [2024-08-05 19:59:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 21982.9, 300 sec: 23493.0). Total num frames: 462725120. Throughput: 0: 5516.4. Samples: 115680870. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:59:43,119][15372] Avg episode reward: [(0, '44.254')] [2024-08-05 19:59:44,342][15417] Signal inference workers to stop experience collection... (20700 times) [2024-08-05 19:59:44,347][15417] Signal inference workers to resume experience collection... (20700 times) [2024-08-05 19:59:44,390][15444] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-08-05 19:59:44,396][15444] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-08-05 19:59:45,216][15444] Updated weights for policy 0, policy_version 56491 (0.0025) [2024-08-05 19:59:48,119][15372] Fps is (10 sec: 22937.2, 60 sec: 22118.3, 300 sec: 23465.2). Total num frames: 462839808. Throughput: 0: 5525.5. Samples: 115716420. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:59:48,119][15372] Avg episode reward: [(0, '44.350')] [2024-08-05 19:59:48,755][15444] Updated weights for policy 0, policy_version 56501 (0.0037) [2024-08-05 19:59:52,304][15444] Updated weights for policy 0, policy_version 56511 (0.0012) [2024-08-05 19:59:53,118][15372] Fps is (10 sec: 22937.7, 60 sec: 22391.5, 300 sec: 23437.5). Total num frames: 462954496. Throughput: 0: 5564.0. Samples: 115733040. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 19:59:53,119][15372] Avg episode reward: [(0, '44.002')] [2024-08-05 19:59:55,862][15444] Updated weights for policy 0, policy_version 56521 (0.0024) [2024-08-05 19:59:58,119][15372] Fps is (10 sec: 23756.4, 60 sec: 22527.9, 300 sec: 23465.2). Total num frames: 463077376. Throughput: 0: 5610.2. Samples: 115768390. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 19:59:58,119][15372] Avg episode reward: [(0, '44.153')] [2024-08-05 19:59:59,452][15444] Updated weights for policy 0, policy_version 56531 (0.0020) [2024-08-05 20:00:03,119][15372] Fps is (10 sec: 22117.5, 60 sec: 22254.8, 300 sec: 23354.1). Total num frames: 463175680. Throughput: 0: 5763.5. Samples: 115802510. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:00:03,127][15372] Avg episode reward: [(0, '43.619')] [2024-08-05 20:00:03,182][15444] Updated weights for policy 0, policy_version 56541 (0.0019) [2024-08-05 20:00:06,493][15444] Updated weights for policy 0, policy_version 56551 (0.0017) [2024-08-05 20:00:08,119][15372] Fps is (10 sec: 22119.0, 60 sec: 22528.0, 300 sec: 23381.9). Total num frames: 463298560. Throughput: 0: 5766.2. Samples: 115820130. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:00:08,126][15372] Avg episode reward: [(0, '43.302')] [2024-08-05 20:00:10,057][15444] Updated weights for policy 0, policy_version 56561 (0.0022) [2024-08-05 20:00:13,119][15372] Fps is (10 sec: 24576.4, 60 sec: 22801.0, 300 sec: 23381.9). Total num frames: 463421440. Throughput: 0: 5796.2. Samples: 115855210. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:00:13,127][15372] Avg episode reward: [(0, '44.350')] [2024-08-05 20:00:13,632][15444] Updated weights for policy 0, policy_version 56571 (0.0014) [2024-08-05 20:00:17,284][15444] Updated weights for policy 0, policy_version 56581 (0.0024) [2024-08-05 20:00:18,120][15372] Fps is (10 sec: 23753.2, 60 sec: 23073.5, 300 sec: 23354.1). 
Total num frames: 463536128. Throughput: 0: 5795.3. Samples: 115889300. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:00:18,120][15372] Avg episode reward: [(0, '44.905')] [2024-08-05 20:00:18,126][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000056584_463536128.pth... [2024-08-05 20:00:18,238][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000055904_457965568.pth [2024-08-05 20:00:20,878][15444] Updated weights for policy 0, policy_version 56591 (0.0013) [2024-08-05 20:00:23,118][15372] Fps is (10 sec: 22938.0, 60 sec: 23211.3, 300 sec: 23354.2). Total num frames: 463650816. Throughput: 0: 5789.8. Samples: 115907030. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:00:23,119][15372] Avg episode reward: [(0, '45.045')] [2024-08-05 20:00:24,253][15444] Updated weights for policy 0, policy_version 56601 (0.0028) [2024-08-05 20:00:27,771][15444] Updated weights for policy 0, policy_version 56611 (0.0013) [2024-08-05 20:00:28,118][15372] Fps is (10 sec: 22941.2, 60 sec: 23210.7, 300 sec: 23326.4). Total num frames: 463765504. Throughput: 0: 5782.4. Samples: 115941080. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:00:28,119][15372] Avg episode reward: [(0, '44.766')] [2024-08-05 20:00:31,503][15444] Updated weights for policy 0, policy_version 56621 (0.0027) [2024-08-05 20:00:33,118][15372] Fps is (10 sec: 22118.5, 60 sec: 23074.1, 300 sec: 23298.6). Total num frames: 463872000. Throughput: 0: 5744.5. Samples: 115974920. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:00:33,119][15372] Avg episode reward: [(0, '43.303')] [2024-08-05 20:00:34,857][15444] Updated weights for policy 0, policy_version 56631 (0.0013) [2024-08-05 20:00:38,118][15372] Fps is (10 sec: 22118.5, 60 sec: 22937.6, 300 sec: 23270.8). Total num frames: 463986688. Throughput: 0: 5777.3. 
Samples: 115993020. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:00:38,126][15372] Avg episode reward: [(0, '43.163')] [2024-08-05 20:00:38,662][15444] Updated weights for policy 0, policy_version 56641 (0.0024) [2024-08-05 20:00:41,956][15444] Updated weights for policy 0, policy_version 56651 (0.0019) [2024-08-05 20:00:43,119][15372] Fps is (10 sec: 22937.2, 60 sec: 22937.6, 300 sec: 23243.1). Total num frames: 464101376. Throughput: 0: 5737.6. Samples: 116026580. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:00:43,119][15372] Avg episode reward: [(0, '43.394')] [2024-08-05 20:00:43,440][15417] Signal inference workers to stop experience collection... (20750 times) [2024-08-05 20:00:43,441][15417] Signal inference workers to resume experience collection... (20750 times) [2024-08-05 20:00:43,481][15444] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-08-05 20:00:43,489][15444] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-08-05 20:00:45,672][15444] Updated weights for policy 0, policy_version 56661 (0.0044) [2024-08-05 20:00:48,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23074.2, 300 sec: 23243.1). Total num frames: 464224256. Throughput: 0: 5773.4. Samples: 116062310. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:00:48,119][15372] Avg episode reward: [(0, '42.882')] [2024-08-05 20:00:49,410][15444] Updated weights for policy 0, policy_version 56671 (0.0047) [2024-08-05 20:00:52,634][15444] Updated weights for policy 0, policy_version 56681 (0.0012) [2024-08-05 20:00:53,118][15372] Fps is (10 sec: 23757.2, 60 sec: 23074.1, 300 sec: 23215.3). Total num frames: 464338944. Throughput: 0: 5753.6. Samples: 116079040. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:00:53,119][15372] Avg episode reward: [(0, '43.116')] [2024-08-05 20:00:56,449][15444] Updated weights for policy 0, policy_version 56691 (0.0017) [2024-08-05 20:00:58,118][15372] Fps is (10 sec: 22118.5, 60 sec: 22801.2, 300 sec: 23187.5). Total num frames: 464445440. Throughput: 0: 5735.1. Samples: 116113290. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:00:58,119][15372] Avg episode reward: [(0, '43.635')] [2024-08-05 20:00:59,789][15444] Updated weights for policy 0, policy_version 56701 (0.0013) [2024-08-05 20:01:03,119][15372] Fps is (10 sec: 22118.1, 60 sec: 23074.2, 300 sec: 23159.7). Total num frames: 464560128. Throughput: 0: 5752.0. Samples: 116148130. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:01:03,126][15372] Avg episode reward: [(0, '42.669')] [2024-08-05 20:01:03,423][15444] Updated weights for policy 0, policy_version 56711 (0.0013) [2024-08-05 20:01:07,335][15444] Updated weights for policy 0, policy_version 56721 (0.0015) [2024-08-05 20:01:08,119][15372] Fps is (10 sec: 23756.6, 60 sec: 23074.1, 300 sec: 23159.7). Total num frames: 464683008. Throughput: 0: 5741.3. Samples: 116165390. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:01:08,119][15372] Avg episode reward: [(0, '42.207')] [2024-08-05 20:01:10,295][15444] Updated weights for policy 0, policy_version 56731 (0.0020) [2024-08-05 20:01:13,119][15372] Fps is (10 sec: 23755.8, 60 sec: 22937.5, 300 sec: 23159.7). Total num frames: 464797696. Throughput: 0: 5743.3. Samples: 116199530. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:01:13,120][15372] Avg episode reward: [(0, '43.762')] [2024-08-05 20:01:14,337][15444] Updated weights for policy 0, policy_version 56741 (0.0034) [2024-08-05 20:01:17,517][15444] Updated weights for policy 0, policy_version 56751 (0.0012) [2024-08-05 20:01:18,119][15372] Fps is (10 sec: 22118.1, 60 sec: 22801.6, 300 sec: 23104.2). 
Total num frames: 464904192. Throughput: 0: 5749.7. Samples: 116233660. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:01:18,119][15372] Avg episode reward: [(0, '44.063')] [2024-08-05 20:01:21,257][15444] Updated weights for policy 0, policy_version 56761 (0.0035) [2024-08-05 20:01:23,118][15372] Fps is (10 sec: 22938.9, 60 sec: 22937.6, 300 sec: 23104.2). Total num frames: 465027072. Throughput: 0: 5742.0. Samples: 116251410. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:01:23,119][15372] Avg episode reward: [(0, '44.228')] [2024-08-05 20:01:24,905][15444] Updated weights for policy 0, policy_version 56771 (0.0013) [2024-08-05 20:01:25,035][15417] Signal inference workers to stop experience collection... (20800 times) [2024-08-05 20:01:25,036][15417] Signal inference workers to resume experience collection... (20800 times) [2024-08-05 20:01:25,112][15444] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-08-05 20:01:25,112][15444] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-08-05 20:01:28,118][15372] Fps is (10 sec: 23757.2, 60 sec: 22937.6, 300 sec: 23104.2). Total num frames: 465141760. Throughput: 0: 5773.1. Samples: 116286370. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:01:28,119][15372] Avg episode reward: [(0, '44.501')] [2024-08-05 20:01:28,153][15444] Updated weights for policy 0, policy_version 56781 (0.0014) [2024-08-05 20:01:32,044][15444] Updated weights for policy 0, policy_version 56791 (0.0016) [2024-08-05 20:01:33,119][15372] Fps is (10 sec: 22937.4, 60 sec: 23074.1, 300 sec: 23076.4). Total num frames: 465256448. Throughput: 0: 5730.9. Samples: 116320200. 
Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:01:33,119][15372] Avg episode reward: [(0, '44.140')] [2024-08-05 20:01:35,301][15444] Updated weights for policy 0, policy_version 56801 (0.0012) [2024-08-05 20:01:38,119][15372] Fps is (10 sec: 22937.1, 60 sec: 23074.0, 300 sec: 23048.7). Total num frames: 465371136. Throughput: 0: 5755.3. Samples: 116338030. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:01:38,119][15372] Avg episode reward: [(0, '43.430')] [2024-08-05 20:01:39,093][15444] Updated weights for policy 0, policy_version 56811 (0.0017) [2024-08-05 20:01:42,500][15444] Updated weights for policy 0, policy_version 56821 (0.0030) [2024-08-05 20:01:43,118][15372] Fps is (10 sec: 22937.9, 60 sec: 23074.2, 300 sec: 23048.7). Total num frames: 465485824. Throughput: 0: 5766.2. Samples: 116372770. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:01:43,126][15372] Avg episode reward: [(0, '42.649')] [2024-08-05 20:01:46,050][15444] Updated weights for policy 0, policy_version 56831 (0.0022) [2024-08-05 20:01:48,119][15372] Fps is (10 sec: 22937.3, 60 sec: 22937.5, 300 sec: 23020.9). Total num frames: 465600512. Throughput: 0: 5745.5. Samples: 116406680. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:01:48,120][15372] Avg episode reward: [(0, '43.175')] [2024-08-05 20:01:49,651][15444] Updated weights for policy 0, policy_version 56841 (0.0015) [2024-08-05 20:01:53,118][15372] Fps is (10 sec: 22937.6, 60 sec: 22937.6, 300 sec: 22993.1). Total num frames: 465715200. Throughput: 0: 5741.1. Samples: 116423740. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:01:53,126][15372] Avg episode reward: [(0, '42.832')] [2024-08-05 20:01:53,411][15444] Updated weights for policy 0, policy_version 56851 (0.0024) [2024-08-05 20:01:57,127][15444] Updated weights for policy 0, policy_version 56861 (0.0015) [2024-08-05 20:01:58,118][15372] Fps is (10 sec: 22938.5, 60 sec: 23074.1, 300 sec: 22993.1). 
Total num frames: 465829888. Throughput: 0: 5730.1. Samples: 116457380. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:01:58,119][15372] Avg episode reward: [(0, '43.236')] [2024-08-05 20:02:00,409][15444] Updated weights for policy 0, policy_version 56871 (0.0013) [2024-08-05 20:02:03,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23074.2, 300 sec: 22965.4). Total num frames: 465944576. Throughput: 0: 5742.7. Samples: 116492080. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:02:03,126][15372] Avg episode reward: [(0, '44.055')] [2024-08-05 20:02:04,174][15444] Updated weights for policy 0, policy_version 56881 (0.0013) [2024-08-05 20:02:07,671][15444] Updated weights for policy 0, policy_version 56891 (0.0018) [2024-08-05 20:02:08,118][15372] Fps is (10 sec: 22937.6, 60 sec: 22937.6, 300 sec: 22937.6). Total num frames: 466059264. Throughput: 0: 5731.3. Samples: 116509320. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:02:08,119][15372] Avg episode reward: [(0, '43.178')] [2024-08-05 20:02:11,307][15444] Updated weights for policy 0, policy_version 56901 (0.0020) [2024-08-05 20:02:13,118][15372] Fps is (10 sec: 22937.6, 60 sec: 22937.8, 300 sec: 22938.6). Total num frames: 466173952. Throughput: 0: 5712.2. Samples: 116543420. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:02:13,119][15372] Avg episode reward: [(0, '43.616')] [2024-08-05 20:02:14,729][15444] Updated weights for policy 0, policy_version 56911 (0.0024) [2024-08-05 20:02:18,119][15372] Fps is (10 sec: 22937.3, 60 sec: 23074.2, 300 sec: 22909.8). Total num frames: 466288640. Throughput: 0: 5750.7. Samples: 116578980. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:02:18,127][15372] Avg episode reward: [(0, '43.923')] [2024-08-05 20:02:18,131][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000056920_466288640.pth... 
[2024-08-05 20:02:18,291][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000056255_460840960.pth [2024-08-05 20:02:18,413][15444] Updated weights for policy 0, policy_version 56921 (0.0017) [2024-08-05 20:02:21,895][15444] Updated weights for policy 0, policy_version 56931 (0.0012) [2024-08-05 20:02:23,119][15372] Fps is (10 sec: 23756.3, 60 sec: 23074.0, 300 sec: 22909.8). Total num frames: 466411520. Throughput: 0: 5742.2. Samples: 116596430. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:02:23,119][15372] Avg episode reward: [(0, '44.124')] [2024-08-05 20:02:25,091][15444] Updated weights for policy 0, policy_version 56941 (0.0014) [2024-08-05 20:02:26,728][15417] Signal inference workers to stop experience collection... (20850 times) [2024-08-05 20:02:26,729][15417] Signal inference workers to resume experience collection... (20850 times) [2024-08-05 20:02:26,758][15444] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-08-05 20:02:26,759][15444] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-08-05 20:02:28,118][15372] Fps is (10 sec: 22937.8, 60 sec: 22937.6, 300 sec: 22854.3). Total num frames: 466518016. Throughput: 0: 5745.3. Samples: 116631310. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:02:28,119][15372] Avg episode reward: [(0, '42.977')] [2024-08-05 20:02:28,695][15444] Updated weights for policy 0, policy_version 56951 (0.0019) [2024-08-05 20:02:32,570][15444] Updated weights for policy 0, policy_version 56961 (0.0020) [2024-08-05 20:02:33,119][15372] Fps is (10 sec: 22937.8, 60 sec: 23074.1, 300 sec: 22882.1). Total num frames: 466640896. Throughput: 0: 5773.4. Samples: 116666480. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:02:33,119][15372] Avg episode reward: [(0, '42.323')] [2024-08-05 20:02:35,580][15444] Updated weights for policy 0, policy_version 56971 (0.0018) [2024-08-05 20:02:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23074.2, 300 sec: 22854.3). Total num frames: 466755584. Throughput: 0: 5772.9. Samples: 116683520. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:02:38,119][15372] Avg episode reward: [(0, '43.576')] [2024-08-05 20:02:39,561][15444] Updated weights for policy 0, policy_version 56981 (0.0011) [2024-08-05 20:02:42,852][15444] Updated weights for policy 0, policy_version 56991 (0.0017) [2024-08-05 20:02:43,118][15372] Fps is (10 sec: 22938.0, 60 sec: 23074.1, 300 sec: 22854.3). Total num frames: 466870272. Throughput: 0: 5804.2. Samples: 116718570. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:02:43,119][15372] Avg episode reward: [(0, '44.099')] [2024-08-05 20:02:46,478][15444] Updated weights for policy 0, policy_version 57001 (0.0025) [2024-08-05 20:02:48,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23074.3, 300 sec: 22854.3). Total num frames: 466984960. Throughput: 0: 5777.3. Samples: 116752060. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:02:48,119][15372] Avg episode reward: [(0, '43.707')] [2024-08-05 20:02:49,998][15444] Updated weights for policy 0, policy_version 57011 (0.0028) [2024-08-05 20:02:53,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23074.1, 300 sec: 22826.5). Total num frames: 467099648. Throughput: 0: 5799.8. Samples: 116770310. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:02:53,126][15372] Avg episode reward: [(0, '43.652')] [2024-08-05 20:02:53,667][15444] Updated weights for policy 0, policy_version 57021 (0.0013) [2024-08-05 20:02:57,323][15444] Updated weights for policy 0, policy_version 57031 (0.0023) [2024-08-05 20:02:58,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23074.1, 300 sec: 22798.8). 
Total num frames: 467214336. Throughput: 0: 5798.9. Samples: 116804370. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:02:58,119][15372] Avg episode reward: [(0, '43.229')] [2024-08-05 20:03:00,612][15444] Updated weights for policy 0, policy_version 57041 (0.0024) [2024-08-05 20:03:03,118][15372] Fps is (10 sec: 22937.5, 60 sec: 23074.1, 300 sec: 22798.8). Total num frames: 467329024. Throughput: 0: 5778.2. Samples: 116839000. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:03:03,127][15372] Avg episode reward: [(0, '42.724')] [2024-08-05 20:03:04,337][15444] Updated weights for policy 0, policy_version 57051 (0.0012) [2024-08-05 20:03:07,831][15444] Updated weights for policy 0, policy_version 57061 (0.0013) [2024-08-05 20:03:08,118][15372] Fps is (10 sec: 22937.6, 60 sec: 23074.1, 300 sec: 22771.0). Total num frames: 467443712. Throughput: 0: 5769.4. Samples: 116856050. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:03:08,119][15372] Avg episode reward: [(0, '43.342')] [2024-08-05 20:03:11,601][15444] Updated weights for policy 0, policy_version 57071 (0.0016) [2024-08-05 20:03:13,118][15372] Fps is (10 sec: 22937.8, 60 sec: 23074.2, 300 sec: 22771.0). Total num frames: 467558400. Throughput: 0: 5755.6. Samples: 116890310. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:03:13,119][15372] Avg episode reward: [(0, '44.308')] [2024-08-05 20:03:14,990][15444] Updated weights for policy 0, policy_version 57081 (0.0013) [2024-08-05 20:03:17,259][15417] Signal inference workers to stop experience collection... (20900 times) [2024-08-05 20:03:17,260][15417] Signal inference workers to resume experience collection... 
(20900 times) [2024-08-05 20:03:17,317][15444] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-08-05 20:03:17,317][15444] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-08-05 20:03:18,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23074.2, 300 sec: 22826.5). Total num frames: 467673088. Throughput: 0: 5741.4. Samples: 116924840. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:03:18,126][15372] Avg episode reward: [(0, '44.667')] [2024-08-05 20:03:18,640][15444] Updated weights for policy 0, policy_version 57091 (0.0017) [2024-08-05 20:03:22,059][15444] Updated weights for policy 0, policy_version 57101 (0.0019) [2024-08-05 20:03:23,118][15372] Fps is (10 sec: 22937.5, 60 sec: 22937.7, 300 sec: 22826.5). Total num frames: 467787776. Throughput: 0: 5763.6. Samples: 116942880. Policy #0 lag: (min: 0.0, avg: 4.0, max: 7.0) [2024-08-05 20:03:23,120][15372] Avg episode reward: [(0, '44.152')] [2024-08-05 20:03:25,937][15444] Updated weights for policy 0, policy_version 57111 (0.0039) [2024-08-05 20:03:28,120][15372] Fps is (10 sec: 22933.8, 60 sec: 23073.5, 300 sec: 22798.6). Total num frames: 467902464. Throughput: 0: 5740.5. Samples: 116976900. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 20:03:28,128][15372] Avg episode reward: [(0, '44.025')] [2024-08-05 20:03:29,070][15444] Updated weights for policy 0, policy_version 57121 (0.0014) [2024-08-05 20:03:33,050][15444] Updated weights for policy 0, policy_version 57131 (0.0029) [2024-08-05 20:03:33,118][15372] Fps is (10 sec: 22937.7, 60 sec: 22937.7, 300 sec: 22798.8). Total num frames: 468017152. Throughput: 0: 5756.7. Samples: 117011110. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 20:03:33,119][15372] Avg episode reward: [(0, '43.947')] [2024-08-05 20:03:36,279][15444] Updated weights for policy 0, policy_version 57141 (0.0012) [2024-08-05 20:03:38,119][15372] Fps is (10 sec: 22940.6, 60 sec: 22937.5, 300 sec: 22799.0). 
Total num frames: 468131840. Throughput: 0: 5737.1. Samples: 117028480. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 20:03:38,127][15372] Avg episode reward: [(0, '43.665')] [2024-08-05 20:03:40,078][15444] Updated weights for policy 0, policy_version 57151 (0.0018) [2024-08-05 20:03:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23074.1, 300 sec: 22854.3). Total num frames: 468254720. Throughput: 0: 5753.6. Samples: 117063280. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 20:03:43,126][15372] Avg episode reward: [(0, '44.069')] [2024-08-05 20:03:43,882][15444] Updated weights for policy 0, policy_version 57161 (0.0022) [2024-08-05 20:03:47,104][15444] Updated weights for policy 0, policy_version 57171 (0.0016) [2024-08-05 20:03:48,118][15372] Fps is (10 sec: 22938.2, 60 sec: 22937.6, 300 sec: 22882.1). Total num frames: 468361216. Throughput: 0: 5729.6. Samples: 117096830. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 20:03:48,119][15372] Avg episode reward: [(0, '44.520')] [2024-08-05 20:03:50,814][15444] Updated weights for policy 0, policy_version 57181 (0.0019) [2024-08-05 20:03:53,118][15372] Fps is (10 sec: 22118.4, 60 sec: 22937.6, 300 sec: 22882.1). Total num frames: 468475904. Throughput: 0: 5745.6. Samples: 117114600. Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 20:03:53,126][15372] Avg episode reward: [(0, '43.745')] [2024-08-05 20:03:54,284][15444] Updated weights for policy 0, policy_version 57191 (0.0018) [2024-08-05 20:03:57,955][15444] Updated weights for policy 0, policy_version 57201 (0.0015) [2024-08-05 20:03:58,118][15372] Fps is (10 sec: 22937.7, 60 sec: 22937.6, 300 sec: 22882.1). Total num frames: 468590592. Throughput: 0: 5742.7. Samples: 117148730. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 8.0) [2024-08-05 20:03:58,119][15372] Avg episode reward: [(0, '42.670')] [2024-08-05 20:04:01,534][15444] Updated weights for policy 0, policy_version 57211 (0.0011) [2024-08-05 20:04:03,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23074.1, 300 sec: 22937.6). Total num frames: 468713472. Throughput: 0: 5730.4. Samples: 117182710. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:04:03,119][15372] Avg episode reward: [(0, '43.367')] [2024-08-05 20:04:04,842][15417] Signal inference workers to stop experience collection... (20950 times) [2024-08-05 20:04:04,845][15417] Signal inference workers to resume experience collection... (20950 times) [2024-08-05 20:04:04,881][15444] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-08-05 20:04:04,881][15444] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-08-05 20:04:04,962][15444] Updated weights for policy 0, policy_version 57221 (0.0034) [2024-08-05 20:04:08,119][15372] Fps is (10 sec: 22937.4, 60 sec: 22937.6, 300 sec: 22937.6). Total num frames: 468819968. Throughput: 0: 5730.0. Samples: 117200730. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:04:08,119][15372] Avg episode reward: [(0, '44.072')] [2024-08-05 20:04:08,751][15444] Updated weights for policy 0, policy_version 57231 (0.0014) [2024-08-05 20:04:11,888][15444] Updated weights for policy 0, policy_version 57241 (0.0013) [2024-08-05 20:04:13,119][15372] Fps is (10 sec: 22116.9, 60 sec: 22937.3, 300 sec: 22993.1). Total num frames: 468934656. Throughput: 0: 5738.1. Samples: 117235110. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:04:13,119][15372] Avg episode reward: [(0, '45.184')] [2024-08-05 20:04:13,122][15417] Saving new best policy, reward=45.184! 
[2024-08-05 20:04:15,843][15444] Updated weights for policy 0, policy_version 57251 (0.0010) [2024-08-05 20:04:18,118][15372] Fps is (10 sec: 22937.8, 60 sec: 22937.6, 300 sec: 23021.0). Total num frames: 469049344. Throughput: 0: 5763.3. Samples: 117270460. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:04:18,119][15372] Avg episode reward: [(0, '44.319')] [2024-08-05 20:04:18,149][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000057258_469057536.pth... [2024-08-05 20:04:18,411][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000056584_463536128.pth [2024-08-05 20:04:19,274][15444] Updated weights for policy 0, policy_version 57261 (0.0021) [2024-08-05 20:04:22,866][15444] Updated weights for policy 0, policy_version 57271 (0.0014) [2024-08-05 20:04:23,119][15372] Fps is (10 sec: 23757.9, 60 sec: 23074.1, 300 sec: 23048.7). Total num frames: 469172224. Throughput: 0: 5739.8. Samples: 117286770. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:04:23,119][15372] Avg episode reward: [(0, '44.214')] [2024-08-05 20:04:27,501][15444] Updated weights for policy 0, policy_version 57281 (0.0015) [2024-08-05 20:04:28,119][15372] Fps is (10 sec: 20479.4, 60 sec: 22528.5, 300 sec: 22937.6). Total num frames: 469254144. Throughput: 0: 5619.7. Samples: 117316170. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:04:28,120][15372] Avg episode reward: [(0, '44.136')] [2024-08-05 20:04:31,304][15444] Updated weights for policy 0, policy_version 57291 (0.0015) [2024-08-05 20:04:33,118][15372] Fps is (10 sec: 18842.0, 60 sec: 22391.5, 300 sec: 22882.1). Total num frames: 469360640. Throughput: 0: 5554.2. Samples: 117346770. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:04:33,119][15372] Avg episode reward: [(0, '43.539')] [2024-08-05 20:04:35,309][15444] Updated weights for policy 0, policy_version 57301 (0.0013) [2024-08-05 20:04:38,118][15372] Fps is (10 sec: 22119.0, 60 sec: 22391.6, 300 sec: 22882.1). Total num frames: 469475328. Throughput: 0: 5537.5. Samples: 117363790. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:04:38,119][15372] Avg episode reward: [(0, '43.658')] [2024-08-05 20:04:38,502][15444] Updated weights for policy 0, policy_version 57311 (0.0015) [2024-08-05 20:04:42,232][15444] Updated weights for policy 0, policy_version 57321 (0.0013) [2024-08-05 20:04:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 22391.5, 300 sec: 22909.8). Total num frames: 469598208. Throughput: 0: 5570.9. Samples: 117399420. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:04:43,119][15372] Avg episode reward: [(0, '44.641')] [2024-08-05 20:04:45,621][15444] Updated weights for policy 0, policy_version 57331 (0.0015) [2024-08-05 20:04:48,119][15372] Fps is (10 sec: 23756.6, 60 sec: 22528.0, 300 sec: 22909.8). Total num frames: 469712896. Throughput: 0: 5602.2. Samples: 117434810. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:04:48,126][15372] Avg episode reward: [(0, '45.956')] [2024-08-05 20:04:48,217][15417] Saving new best policy, reward=45.956! [2024-08-05 20:04:48,912][15444] Updated weights for policy 0, policy_version 57341 (0.0019) [2024-08-05 20:04:50,883][15417] Signal inference workers to stop experience collection... (21000 times) [2024-08-05 20:04:50,892][15417] Signal inference workers to resume experience collection... 
(21000 times) [2024-08-05 20:04:50,934][15444] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-08-05 20:04:50,934][15444] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-08-05 20:04:52,411][15444] Updated weights for policy 0, policy_version 57351 (0.0025) [2024-08-05 20:04:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 22664.5, 300 sec: 22909.9). Total num frames: 469835776. Throughput: 0: 5602.5. Samples: 117452840. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:04:53,119][15372] Avg episode reward: [(0, '44.800')] [2024-08-05 20:04:55,587][15444] Updated weights for policy 0, policy_version 57361 (0.0024) [2024-08-05 20:04:58,120][15372] Fps is (10 sec: 24572.6, 60 sec: 22800.5, 300 sec: 22993.1). Total num frames: 469958656. Throughput: 0: 5656.1. Samples: 117489640. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:04:58,128][15372] Avg episode reward: [(0, '43.330')] [2024-08-05 20:04:59,308][15444] Updated weights for policy 0, policy_version 57371 (0.0013) [2024-08-05 20:05:02,293][15444] Updated weights for policy 0, policy_version 57381 (0.0023) [2024-08-05 20:05:03,118][15372] Fps is (10 sec: 23756.8, 60 sec: 22664.5, 300 sec: 22965.4). Total num frames: 470073344. Throughput: 0: 5666.7. Samples: 117525460. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:05:03,126][15372] Avg episode reward: [(0, '43.703')] [2024-08-05 20:05:05,719][15444] Updated weights for policy 0, policy_version 57391 (0.0022) [2024-08-05 20:05:08,119][15372] Fps is (10 sec: 23759.1, 60 sec: 22937.4, 300 sec: 22965.3). Total num frames: 470196224. Throughput: 0: 5719.1. Samples: 117544130. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:05:08,119][15372] Avg episode reward: [(0, '43.848')] [2024-08-05 20:05:09,369][15444] Updated weights for policy 0, policy_version 57401 (0.0014) [2024-08-05 20:05:12,528][15444] Updated weights for policy 0, policy_version 57411 (0.0014) [2024-08-05 20:05:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 23074.4, 300 sec: 22993.3). Total num frames: 470319104. Throughput: 0: 5885.8. Samples: 117581030. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:05:13,119][15372] Avg episode reward: [(0, '43.320')] [2024-08-05 20:05:16,341][15444] Updated weights for policy 0, policy_version 57421 (0.0013) [2024-08-05 20:05:18,119][15372] Fps is (10 sec: 23757.8, 60 sec: 23074.1, 300 sec: 22993.1). Total num frames: 470433792. Throughput: 0: 5977.3. Samples: 117615750. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:05:18,119][15372] Avg episode reward: [(0, '43.945')] [2024-08-05 20:05:19,264][15444] Updated weights for policy 0, policy_version 57431 (0.0020) [2024-08-05 20:05:22,926][15444] Updated weights for policy 0, policy_version 57441 (0.0013) [2024-08-05 20:05:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23074.2, 300 sec: 23020.9). Total num frames: 470556672. Throughput: 0: 6026.5. Samples: 117634980. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:05:23,119][15372] Avg episode reward: [(0, '44.015')] [2024-08-05 20:05:25,972][15444] Updated weights for policy 0, policy_version 57451 (0.0012) [2024-08-05 20:05:28,119][15372] Fps is (10 sec: 24576.1, 60 sec: 23756.9, 300 sec: 23076.4). Total num frames: 470679552. Throughput: 0: 6018.7. Samples: 117670260. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:05:28,119][15372] Avg episode reward: [(0, '44.049')] [2024-08-05 20:05:29,296][15417] Signal inference workers to stop experience collection... (21050 times) [2024-08-05 20:05:29,304][15417] Signal inference workers to resume experience collection... 
(21050 times) [2024-08-05 20:05:29,338][15444] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-08-05 20:05:29,339][15444] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-08-05 20:05:29,827][15444] Updated weights for policy 0, policy_version 57461 (0.0025) [2024-08-05 20:05:33,119][15372] Fps is (10 sec: 23756.4, 60 sec: 23893.3, 300 sec: 23076.4). Total num frames: 470794240. Throughput: 0: 6034.9. Samples: 117706380. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:05:33,119][15372] Avg episode reward: [(0, '43.538')] [2024-08-05 20:05:33,200][15444] Updated weights for policy 0, policy_version 57471 (0.0026) [2024-08-05 20:05:36,377][15444] Updated weights for policy 0, policy_version 57481 (0.0013) [2024-08-05 20:05:38,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 23104.2). Total num frames: 470917120. Throughput: 0: 6044.4. Samples: 117724840. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:05:38,126][15372] Avg episode reward: [(0, '43.193')] [2024-08-05 20:05:39,936][15444] Updated weights for policy 0, policy_version 57491 (0.0011) [2024-08-05 20:05:43,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24029.9, 300 sec: 23104.2). Total num frames: 471040000. Throughput: 0: 6032.0. Samples: 117761070. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:05:43,126][15372] Avg episode reward: [(0, '43.118')] [2024-08-05 20:05:43,146][15444] Updated weights for policy 0, policy_version 57501 (0.0020) [2024-08-05 20:05:46,734][15444] Updated weights for policy 0, policy_version 57511 (0.0011) [2024-08-05 20:05:48,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.4, 300 sec: 23132.0). Total num frames: 471162880. Throughput: 0: 6022.4. Samples: 117796470. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:05:48,119][15372] Avg episode reward: [(0, '43.047')] [2024-08-05 20:05:50,238][15444] Updated weights for policy 0, policy_version 57521 (0.0016) [2024-08-05 20:05:53,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24029.8, 300 sec: 23159.7). Total num frames: 471277568. Throughput: 0: 6011.4. Samples: 117814640. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:05:53,126][15372] Avg episode reward: [(0, '43.439')] [2024-08-05 20:05:53,526][15444] Updated weights for policy 0, policy_version 57531 (0.0012) [2024-08-05 20:05:57,276][15444] Updated weights for policy 0, policy_version 57541 (0.0021) [2024-08-05 20:05:58,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24030.3, 300 sec: 23187.5). Total num frames: 471400448. Throughput: 0: 5977.0. Samples: 117850000. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:05:58,119][15372] Avg episode reward: [(0, '43.384')] [2024-08-05 20:06:00,227][15444] Updated weights for policy 0, policy_version 57551 (0.0021) [2024-08-05 20:06:03,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.8, 300 sec: 23159.8). Total num frames: 471515136. Throughput: 0: 6012.2. Samples: 117886300. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:06:03,126][15372] Avg episode reward: [(0, '42.165')] [2024-08-05 20:06:03,851][15444] Updated weights for policy 0, policy_version 57561 (0.0012) [2024-08-05 20:06:07,534][15444] Updated weights for policy 0, policy_version 57571 (0.0022) [2024-08-05 20:06:08,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24029.8, 300 sec: 23187.5). Total num frames: 471638016. Throughput: 0: 5989.0. Samples: 117904490. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:06:08,120][15372] Avg episode reward: [(0, '43.224')] [2024-08-05 20:06:09,652][15417] Signal inference workers to stop experience collection... (21100 times) [2024-08-05 20:06:09,653][15417] Signal inference workers to resume experience collection... 
(21100 times) [2024-08-05 20:06:09,706][15444] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-08-05 20:06:09,707][15444] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-08-05 20:06:10,529][15444] Updated weights for policy 0, policy_version 57581 (0.0035) [2024-08-05 20:06:13,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24029.9, 300 sec: 23243.1). Total num frames: 471760896. Throughput: 0: 6001.1. Samples: 117940310. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:06:13,119][15372] Avg episode reward: [(0, '43.380')] [2024-08-05 20:06:13,940][15444] Updated weights for policy 0, policy_version 57591 (0.0018) [2024-08-05 20:06:17,162][15444] Updated weights for policy 0, policy_version 57601 (0.0017) [2024-08-05 20:06:18,119][15372] Fps is (10 sec: 24577.7, 60 sec: 24166.4, 300 sec: 23243.1). Total num frames: 471883776. Throughput: 0: 6010.9. Samples: 117976870. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:06:18,119][15372] Avg episode reward: [(0, '44.270')] [2024-08-05 20:06:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000057603_471883776.pth... [2024-08-05 20:06:18,260][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000056920_466288640.pth [2024-08-05 20:06:20,925][15444] Updated weights for policy 0, policy_version 57611 (0.0020) [2024-08-05 20:06:23,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 23243.1). Total num frames: 471998464. Throughput: 0: 5995.1. Samples: 117994620. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:06:23,126][15372] Avg episode reward: [(0, '44.645')] [2024-08-05 20:06:24,446][15444] Updated weights for policy 0, policy_version 57621 (0.0032) [2024-08-05 20:06:27,754][15444] Updated weights for policy 0, policy_version 57631 (0.0012) [2024-08-05 20:06:28,118][15372] Fps is (10 sec: 22937.8, 60 sec: 23893.4, 300 sec: 23243.1). Total num frames: 472113152. Throughput: 0: 5977.1. Samples: 118030040. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:06:28,119][15372] Avg episode reward: [(0, '43.993')] [2024-08-05 20:06:31,148][15444] Updated weights for policy 0, policy_version 57641 (0.0031) [2024-08-05 20:06:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 23270.9). Total num frames: 472236032. Throughput: 0: 5982.3. Samples: 118065670. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:06:33,127][15372] Avg episode reward: [(0, '44.156')] [2024-08-05 20:06:34,428][15444] Updated weights for policy 0, policy_version 57651 (0.0022) [2024-08-05 20:06:38,119][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.3, 300 sec: 23270.8). Total num frames: 472350720. Throughput: 0: 5996.9. Samples: 118084500. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:06:38,126][15372] Avg episode reward: [(0, '44.179')] [2024-08-05 20:06:38,172][15444] Updated weights for policy 0, policy_version 57661 (0.0032) [2024-08-05 20:06:41,146][15444] Updated weights for policy 0, policy_version 57671 (0.0028) [2024-08-05 20:06:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.3, 300 sec: 23298.6). Total num frames: 472473600. Throughput: 0: 6009.6. Samples: 118120430. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:06:43,126][15372] Avg episode reward: [(0, '43.447')] [2024-08-05 20:06:44,620][15444] Updated weights for policy 0, policy_version 57681 (0.0021) [2024-08-05 20:06:48,119][15372] Fps is (10 sec: 24574.9, 60 sec: 23893.2, 300 sec: 23326.3). 
Total num frames: 472596480. Throughput: 0: 5999.9. Samples: 118156300. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:06:48,127][15372] Avg episode reward: [(0, '43.358')] [2024-08-05 20:06:48,268][15444] Updated weights for policy 0, policy_version 57691 (0.0016) [2024-08-05 20:06:51,315][15444] Updated weights for policy 0, policy_version 57701 (0.0022) [2024-08-05 20:06:52,750][15417] Signal inference workers to stop experience collection... (21150 times) [2024-08-05 20:06:52,759][15417] Signal inference workers to resume experience collection... (21150 times) [2024-08-05 20:06:52,785][15444] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-08-05 20:06:52,785][15444] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-08-05 20:06:53,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24166.4, 300 sec: 23381.9). Total num frames: 472727552. Throughput: 0: 6013.9. Samples: 118175110. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:06:53,119][15372] Avg episode reward: [(0, '43.683')] [2024-08-05 20:06:54,851][15444] Updated weights for policy 0, policy_version 57711 (0.0013) [2024-08-05 20:06:58,119][15372] Fps is (10 sec: 24576.9, 60 sec: 24030.0, 300 sec: 23381.9). Total num frames: 472842240. Throughput: 0: 6036.2. Samples: 118211940. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:06:58,126][15372] Avg episode reward: [(0, '43.087')] [2024-08-05 20:06:58,411][15444] Updated weights for policy 0, policy_version 57721 (0.0013) [2024-08-05 20:07:01,443][15444] Updated weights for policy 0, policy_version 57731 (0.0017) [2024-08-05 20:07:03,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24166.4, 300 sec: 23409.7). Total num frames: 472965120. Throughput: 0: 6010.7. Samples: 118247350. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:07:03,126][15372] Avg episode reward: [(0, '42.720')] [2024-08-05 20:07:05,019][15444] Updated weights for policy 0, policy_version 57741 (0.0027) [2024-08-05 20:07:08,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.6, 300 sec: 23437.4). Total num frames: 473088000. Throughput: 0: 6022.0. Samples: 118265610. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:07:08,126][15372] Avg episode reward: [(0, '43.711')] [2024-08-05 20:07:08,436][15444] Updated weights for policy 0, policy_version 57751 (0.0019) [2024-08-05 20:07:11,871][15444] Updated weights for policy 0, policy_version 57761 (0.0012) [2024-08-05 20:07:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 23465.2). Total num frames: 473210880. Throughput: 0: 6044.0. Samples: 118302020. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:07:13,119][15372] Avg episode reward: [(0, '43.254')] [2024-08-05 20:07:14,964][15444] Updated weights for policy 0, policy_version 57771 (0.0021) [2024-08-05 20:07:18,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 23465.2). Total num frames: 473333760. Throughput: 0: 6076.9. Samples: 118339130. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:07:18,126][15372] Avg episode reward: [(0, '43.317')] [2024-08-05 20:07:18,449][15444] Updated weights for policy 0, policy_version 57781 (0.0026) [2024-08-05 20:07:22,138][15444] Updated weights for policy 0, policy_version 57791 (0.0013) [2024-08-05 20:07:23,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 23493.0). Total num frames: 473448448. Throughput: 0: 6067.6. Samples: 118357540. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:07:23,119][15372] Avg episode reward: [(0, '43.336')] [2024-08-05 20:07:25,180][15444] Updated weights for policy 0, policy_version 57801 (0.0022) [2024-08-05 20:07:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24302.9, 300 sec: 23493.0). 
Total num frames: 473571328. Throughput: 0: 6077.6. Samples: 118393920. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:07:28,126][15372] Avg episode reward: [(0, '42.807')] [2024-08-05 20:07:28,732][15444] Updated weights for policy 0, policy_version 57811 (0.0021) [2024-08-05 20:07:31,766][15444] Updated weights for policy 0, policy_version 57821 (0.0018) [2024-08-05 20:07:33,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.9, 300 sec: 23520.7). Total num frames: 473694208. Throughput: 0: 6077.8. Samples: 118429800. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:07:33,119][15372] Avg episode reward: [(0, '43.832')] [2024-08-05 20:07:35,398][15444] Updated weights for policy 0, policy_version 57831 (0.0012) [2024-08-05 20:07:38,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24439.5, 300 sec: 23548.5). Total num frames: 473817088. Throughput: 0: 6063.4. Samples: 118447960. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:07:38,119][15372] Avg episode reward: [(0, '44.516')] [2024-08-05 20:07:38,762][15444] Updated weights for policy 0, policy_version 57841 (0.0013) [2024-08-05 20:07:41,964][15444] Updated weights for policy 0, policy_version 57851 (0.0020) [2024-08-05 20:07:43,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24302.7, 300 sec: 23548.5). Total num frames: 473931776. Throughput: 0: 6057.3. Samples: 118484520. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:07:43,119][15372] Avg episode reward: [(0, '44.141')] [2024-08-05 20:07:45,448][15417] Signal inference workers to stop experience collection... (21200 times) [2024-08-05 20:07:45,449][15417] Signal inference workers to resume experience collection... 
(21200 times) [2024-08-05 20:07:45,494][15444] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-08-05 20:07:45,495][15444] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-08-05 20:07:45,542][15444] Updated weights for policy 0, policy_version 57861 (0.0014) [2024-08-05 20:07:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.7, 300 sec: 23604.1). Total num frames: 474062848. Throughput: 0: 6086.7. Samples: 118521250. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:07:48,119][15372] Avg episode reward: [(0, '43.794')] [2024-08-05 20:07:48,581][15444] Updated weights for policy 0, policy_version 57871 (0.0010) [2024-08-05 20:07:52,299][15444] Updated weights for policy 0, policy_version 57881 (0.0014) [2024-08-05 20:07:53,118][15372] Fps is (10 sec: 24577.4, 60 sec: 24166.5, 300 sec: 23604.1). Total num frames: 474177536. Throughput: 0: 6083.8. Samples: 118539380. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:07:53,119][15372] Avg episode reward: [(0, '44.094')] [2024-08-05 20:07:55,649][15444] Updated weights for policy 0, policy_version 57891 (0.0017) [2024-08-05 20:07:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24439.5, 300 sec: 23659.6). Total num frames: 474308608. Throughput: 0: 6085.6. Samples: 118575870. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:07:58,119][15372] Avg episode reward: [(0, '44.128')] [2024-08-05 20:07:58,960][15444] Updated weights for policy 0, policy_version 57901 (0.0011) [2024-08-05 20:08:02,267][15444] Updated weights for policy 0, policy_version 57911 (0.0015) [2024-08-05 20:08:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 23659.6). Total num frames: 474423296. Throughput: 0: 6055.5. Samples: 118611630. 
Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:08:03,126][15372] Avg episode reward: [(0, '44.120')] [2024-08-05 20:08:05,640][15444] Updated weights for policy 0, policy_version 57921 (0.0017) [2024-08-05 20:08:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.0, 300 sec: 23687.4). Total num frames: 474546176. Throughput: 0: 6047.5. Samples: 118629680. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:08:08,119][15372] Avg episode reward: [(0, '44.020')] [2024-08-05 20:08:09,345][15444] Updated weights for policy 0, policy_version 57931 (0.0012) [2024-08-05 20:08:12,654][15444] Updated weights for policy 0, policy_version 57941 (0.0021) [2024-08-05 20:08:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 23687.4). Total num frames: 474660864. Throughput: 0: 6047.3. Samples: 118666050. Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 20:08:13,119][15372] Avg episode reward: [(0, '43.095')] [2024-08-05 20:08:15,924][15444] Updated weights for policy 0, policy_version 57951 (0.0021) [2024-08-05 20:08:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 23715.1). Total num frames: 474783744. Throughput: 0: 6056.0. Samples: 118702320. Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 20:08:18,126][15372] Avg episode reward: [(0, '43.765')] [2024-08-05 20:08:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000057957_474783744.pth... [2024-08-05 20:08:18,187][15417] Signal inference workers to stop experience collection... (21250 times) [2024-08-05 20:08:18,188][15417] Signal inference workers to resume experience collection... 
(21250 times) [2024-08-05 20:08:18,215][15444] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-08-05 20:08:18,215][15444] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-08-05 20:08:18,275][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000057258_469057536.pth [2024-08-05 20:08:19,224][15444] Updated weights for policy 0, policy_version 57961 (0.0014) [2024-08-05 20:08:22,744][15444] Updated weights for policy 0, policy_version 57971 (0.0019) [2024-08-05 20:08:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 23743.0). Total num frames: 474906624. Throughput: 0: 6041.6. Samples: 118719830. Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 20:08:23,119][15372] Avg episode reward: [(0, '43.440')] [2024-08-05 20:08:26,156][15444] Updated weights for policy 0, policy_version 57981 (0.0017) [2024-08-05 20:08:28,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 23742.9). Total num frames: 475021312. Throughput: 0: 6035.8. Samples: 118756130. Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 20:08:28,126][15372] Avg episode reward: [(0, '43.881')] [2024-08-05 20:08:29,462][15444] Updated weights for policy 0, policy_version 57991 (0.0018) [2024-08-05 20:08:33,047][15444] Updated weights for policy 0, policy_version 58001 (0.0034) [2024-08-05 20:08:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 23770.7). Total num frames: 475144192. Throughput: 0: 6034.4. Samples: 118792800. Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 20:08:33,119][15372] Avg episode reward: [(0, '43.649')] [2024-08-05 20:08:36,258][15444] Updated weights for policy 0, policy_version 58011 (0.0041) [2024-08-05 20:08:38,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 23770.7). Total num frames: 475267072. Throughput: 0: 6037.8. Samples: 118811080. 
Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 20:08:38,126][15372] Avg episode reward: [(0, '43.585')] [2024-08-05 20:08:39,711][15444] Updated weights for policy 0, policy_version 58021 (0.0021) [2024-08-05 20:08:42,857][15444] Updated weights for policy 0, policy_version 58031 (0.0016) [2024-08-05 20:08:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.2, 300 sec: 23826.2). Total num frames: 475389952. Throughput: 0: 6044.9. Samples: 118847890. Policy #0 lag: (min: 2.0, avg: 4.5, max: 9.0) [2024-08-05 20:08:43,119][15372] Avg episode reward: [(0, '42.895')] [2024-08-05 20:08:46,265][15444] Updated weights for policy 0, policy_version 58041 (0.0013) [2024-08-05 20:08:48,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.8, 300 sec: 23826.2). Total num frames: 475504640. Throughput: 0: 6053.8. Samples: 118884050. Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 20:08:48,126][15372] Avg episode reward: [(0, '43.513')] [2024-08-05 20:08:49,919][15444] Updated weights for policy 0, policy_version 58051 (0.0011) [2024-08-05 20:08:52,942][15444] Updated weights for policy 0, policy_version 58061 (0.0012) [2024-08-05 20:08:53,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 23881.8). Total num frames: 475635712. Throughput: 0: 6061.5. Samples: 118902450. Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 20:08:53,119][15372] Avg episode reward: [(0, '44.183')] [2024-08-05 20:08:56,555][15444] Updated weights for policy 0, policy_version 58071 (0.0035) [2024-08-05 20:08:58,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24029.8, 300 sec: 23854.0). Total num frames: 475750400. Throughput: 0: 6051.3. Samples: 118938360. Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 20:08:58,119][15372] Avg episode reward: [(0, '45.292')] [2024-08-05 20:08:59,671][15444] Updated weights for policy 0, policy_version 58081 (0.0026) [2024-08-05 20:09:01,617][15417] Signal inference workers to stop experience collection... 
(21300 times) [2024-08-05 20:09:01,617][15417] Signal inference workers to resume experience collection... (21300 times) [2024-08-05 20:09:01,649][15444] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-08-05 20:09:01,650][15444] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-08-05 20:09:03,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 23909.5). Total num frames: 475873280. Throughput: 0: 6071.8. Samples: 118975550. Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 20:09:03,119][15372] Avg episode reward: [(0, '45.007')] [2024-08-05 20:09:03,241][15444] Updated weights for policy 0, policy_version 58091 (0.0026) [2024-08-05 20:09:06,624][15444] Updated weights for policy 0, policy_version 58101 (0.0019) [2024-08-05 20:09:08,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 23937.3). Total num frames: 475996160. Throughput: 0: 6080.0. Samples: 118993430. Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 20:09:08,119][15372] Avg episode reward: [(0, '43.703')] [2024-08-05 20:09:09,883][15444] Updated weights for policy 0, policy_version 58111 (0.0024) [2024-08-05 20:09:13,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 23965.1). Total num frames: 476119040. Throughput: 0: 6097.6. Samples: 119030520. Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 20:09:13,126][15372] Avg episode reward: [(0, '42.902')] [2024-08-05 20:09:13,541][15444] Updated weights for policy 0, policy_version 58121 (0.0011) [2024-08-05 20:09:16,498][15444] Updated weights for policy 0, policy_version 58131 (0.0024) [2024-08-05 20:09:18,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24302.9, 300 sec: 23965.1). Total num frames: 476241920. Throughput: 0: 6080.7. Samples: 119066430. 
Policy #0 lag: (min: 1.0, avg: 3.3, max: 8.0) [2024-08-05 20:09:18,126][15372] Avg episode reward: [(0, '43.660')] [2024-08-05 20:09:20,041][15444] Updated weights for policy 0, policy_version 58141 (0.0010) [2024-08-05 20:09:23,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24303.0, 300 sec: 24103.9). Total num frames: 476364800. Throughput: 0: 6082.5. Samples: 119084790. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:09:23,126][15372] Avg episode reward: [(0, '43.959')] [2024-08-05 20:09:23,506][15444] Updated weights for policy 0, policy_version 58151 (0.0014) [2024-08-05 20:09:26,721][15444] Updated weights for policy 0, policy_version 58161 (0.0012) [2024-08-05 20:09:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24303.0, 300 sec: 24131.7). Total num frames: 476479488. Throughput: 0: 6069.3. Samples: 119121010. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:09:28,119][15372] Avg episode reward: [(0, '44.137')] [2024-08-05 20:09:30,292][15444] Updated weights for policy 0, policy_version 58171 (0.0021) [2024-08-05 20:09:33,119][15372] Fps is (10 sec: 23755.0, 60 sec: 24302.6, 300 sec: 24159.4). Total num frames: 476602368. Throughput: 0: 6077.3. Samples: 119157530. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:09:33,120][15372] Avg episode reward: [(0, '44.254')] [2024-08-05 20:09:33,442][15444] Updated weights for policy 0, policy_version 58181 (0.0025) [2024-08-05 20:09:35,556][15417] Signal inference workers to stop experience collection... (21350 times) [2024-08-05 20:09:35,557][15417] Signal inference workers to resume experience collection... 
(21350 times) [2024-08-05 20:09:35,626][15444] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-08-05 20:09:35,626][15444] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-08-05 20:09:36,867][15444] Updated weights for policy 0, policy_version 58191 (0.0023) [2024-08-05 20:09:38,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24302.8, 300 sec: 24159.4). Total num frames: 476725248. Throughput: 0: 6061.5. Samples: 119175220. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:09:38,119][15372] Avg episode reward: [(0, '44.054')] [2024-08-05 20:09:40,283][15444] Updated weights for policy 0, policy_version 58201 (0.0042) [2024-08-05 20:09:43,118][15372] Fps is (10 sec: 24577.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 476848128. Throughput: 0: 6100.9. Samples: 119212900. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:09:43,126][15372] Avg episode reward: [(0, '44.808')] [2024-08-05 20:09:43,481][15444] Updated weights for policy 0, policy_version 58211 (0.0022) [2024-08-05 20:09:47,173][15444] Updated weights for policy 0, policy_version 58221 (0.0012) [2024-08-05 20:09:48,118][15372] Fps is (10 sec: 24576.9, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 476971008. Throughput: 0: 6062.4. Samples: 119248360. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:09:48,119][15372] Avg episode reward: [(0, '44.029')] [2024-08-05 20:09:50,349][15444] Updated weights for policy 0, policy_version 58231 (0.0013) [2024-08-05 20:09:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24159.6). Total num frames: 477085696. Throughput: 0: 6061.8. Samples: 119266210. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:09:53,119][15372] Avg episode reward: [(0, '44.036')] [2024-08-05 20:09:54,001][15444] Updated weights for policy 0, policy_version 58241 (0.0018) [2024-08-05 20:09:57,656][15444] Updated weights for policy 0, policy_version 58251 (0.0016) [2024-08-05 20:09:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 477208576. Throughput: 0: 6041.6. Samples: 119302390. Policy #0 lag: (min: 1.0, avg: 2.7, max: 7.0) [2024-08-05 20:09:58,119][15372] Avg episode reward: [(0, '43.568')] [2024-08-05 20:10:00,554][15444] Updated weights for policy 0, policy_version 58261 (0.0012) [2024-08-05 20:10:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 477331456. Throughput: 0: 6037.6. Samples: 119338120. Policy #0 lag: (min: 1.0, avg: 2.7, max: 7.0) [2024-08-05 20:10:03,126][15372] Avg episode reward: [(0, '42.728')] [2024-08-05 20:10:04,265][15444] Updated weights for policy 0, policy_version 58271 (0.0023) [2024-08-05 20:10:07,697][15444] Updated weights for policy 0, policy_version 58281 (0.0030) [2024-08-05 20:10:08,118][15372] Fps is (10 sec: 22937.7, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 477437952. Throughput: 0: 6022.9. Samples: 119355820. Policy #0 lag: (min: 1.0, avg: 2.7, max: 7.0) [2024-08-05 20:10:08,119][15372] Avg episode reward: [(0, '44.010')] [2024-08-05 20:10:10,961][15444] Updated weights for policy 0, policy_version 58291 (0.0020) [2024-08-05 20:10:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 477569024. Throughput: 0: 6011.6. Samples: 119391530. 
Policy #0 lag: (min: 1.0, avg: 2.7, max: 7.0) [2024-08-05 20:10:13,119][15372] Avg episode reward: [(0, '45.175')] [2024-08-05 20:10:14,563][15444] Updated weights for policy 0, policy_version 58301 (0.0014) [2024-08-05 20:10:17,790][15444] Updated weights for policy 0, policy_version 58311 (0.0017) [2024-08-05 20:10:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 477683712. Throughput: 0: 5997.0. Samples: 119427390. Policy #0 lag: (min: 1.0, avg: 2.7, max: 7.0) [2024-08-05 20:10:18,119][15372] Avg episode reward: [(0, '44.940')] [2024-08-05 20:10:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000058311_477683712.pth... [2024-08-05 20:10:18,280][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000057603_471883776.pth [2024-08-05 20:10:21,357][15444] Updated weights for policy 0, policy_version 58321 (0.0019) [2024-08-05 20:10:21,439][15417] Signal inference workers to stop experience collection... (21400 times) [2024-08-05 20:10:21,440][15417] Signal inference workers to resume experience collection... (21400 times) [2024-08-05 20:10:21,512][15444] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-08-05 20:10:21,512][15444] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-08-05 20:10:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 477806592. Throughput: 0: 6017.4. Samples: 119446000. Policy #0 lag: (min: 1.0, avg: 2.7, max: 7.0) [2024-08-05 20:10:23,119][15372] Avg episode reward: [(0, '44.646')] [2024-08-05 20:10:24,589][15444] Updated weights for policy 0, policy_version 58331 (0.0013) [2024-08-05 20:10:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 477921280. Throughput: 0: 6001.3. Samples: 119482960. 
Policy #0 lag: (min: 1.0, avg: 2.7, max: 7.0) [2024-08-05 20:10:28,127][15372] Avg episode reward: [(0, '44.069')] [2024-08-05 20:10:28,160][15444] Updated weights for policy 0, policy_version 58341 (0.0019) [2024-08-05 20:10:31,241][15444] Updated weights for policy 0, policy_version 58351 (0.0021) [2024-08-05 20:10:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.7, 300 sec: 24187.2). Total num frames: 478052352. Throughput: 0: 6016.5. Samples: 119519100. Policy #0 lag: (min: 1.0, avg: 2.7, max: 7.0) [2024-08-05 20:10:33,126][15372] Avg episode reward: [(0, '44.037')] [2024-08-05 20:10:34,788][15444] Updated weights for policy 0, policy_version 58361 (0.0013) [2024-08-05 20:10:38,069][15444] Updated weights for policy 0, policy_version 58371 (0.0014) [2024-08-05 20:10:38,119][15372] Fps is (10 sec: 25393.4, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 478175232. Throughput: 0: 6024.6. Samples: 119537320. Policy #0 lag: (min: 1.0, avg: 4.2, max: 7.0) [2024-08-05 20:10:38,120][15372] Avg episode reward: [(0, '43.491')] [2024-08-05 20:10:41,600][15444] Updated weights for policy 0, policy_version 58381 (0.0025) [2024-08-05 20:10:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 478289920. Throughput: 0: 6012.4. Samples: 119572950. Policy #0 lag: (min: 1.0, avg: 4.2, max: 7.0) [2024-08-05 20:10:43,126][15372] Avg episode reward: [(0, '43.864')] [2024-08-05 20:10:44,954][15444] Updated weights for policy 0, policy_version 58391 (0.0027) [2024-08-05 20:10:48,118][15372] Fps is (10 sec: 23758.6, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 478412800. Throughput: 0: 6015.3. Samples: 119608810. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 7.0) [2024-08-05 20:10:48,126][15372] Avg episode reward: [(0, '44.038')] [2024-08-05 20:10:48,412][15444] Updated weights for policy 0, policy_version 58401 (0.0018) [2024-08-05 20:10:51,599][15444] Updated weights for policy 0, policy_version 58411 (0.0011) [2024-08-05 20:10:53,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24187.3). Total num frames: 478535680. Throughput: 0: 6044.6. Samples: 119627830. Policy #0 lag: (min: 1.0, avg: 4.2, max: 7.0) [2024-08-05 20:10:53,126][15372] Avg episode reward: [(0, '43.194')] [2024-08-05 20:10:55,056][15444] Updated weights for policy 0, policy_version 58421 (0.0019) [2024-08-05 20:10:58,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 478658560. Throughput: 0: 6066.2. Samples: 119664510. Policy #0 lag: (min: 1.0, avg: 4.2, max: 7.0) [2024-08-05 20:10:58,126][15372] Avg episode reward: [(0, '42.574')] [2024-08-05 20:10:58,518][15444] Updated weights for policy 0, policy_version 58431 (0.0026) [2024-08-05 20:11:01,747][15444] Updated weights for policy 0, policy_version 58441 (0.0033) [2024-08-05 20:11:03,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24187.3). Total num frames: 478773248. Throughput: 0: 6061.1. Samples: 119700140. Policy #0 lag: (min: 1.0, avg: 4.2, max: 7.0) [2024-08-05 20:11:03,119][15372] Avg episode reward: [(0, '42.798')] [2024-08-05 20:11:05,257][15444] Updated weights for policy 0, policy_version 58451 (0.0018) [2024-08-05 20:11:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 478904320. Throughput: 0: 6060.9. Samples: 119718740. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 7.0) [2024-08-05 20:11:08,126][15372] Avg episode reward: [(0, '43.246')] [2024-08-05 20:11:08,564][15444] Updated weights for policy 0, policy_version 58461 (0.0045) [2024-08-05 20:11:11,924][15444] Updated weights for policy 0, policy_version 58471 (0.0012) [2024-08-05 20:11:13,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 479019008. Throughput: 0: 6038.2. Samples: 119754680. Policy #0 lag: (min: 1.0, avg: 4.6, max: 8.0) [2024-08-05 20:11:13,119][15372] Avg episode reward: [(0, '43.510')] [2024-08-05 20:11:15,428][15444] Updated weights for policy 0, policy_version 58481 (0.0016) [2024-08-05 20:11:18,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 479141888. Throughput: 0: 6047.8. Samples: 119791250. Policy #0 lag: (min: 1.0, avg: 4.6, max: 8.0) [2024-08-05 20:11:18,126][15372] Avg episode reward: [(0, '43.633')] [2024-08-05 20:11:18,947][15444] Updated weights for policy 0, policy_version 58491 (0.0019) [2024-08-05 20:11:22,355][15444] Updated weights for policy 0, policy_version 58501 (0.0017) [2024-08-05 20:11:23,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24242.8). Total num frames: 479264768. Throughput: 0: 6035.2. Samples: 119808900. Policy #0 lag: (min: 1.0, avg: 4.6, max: 8.0) [2024-08-05 20:11:23,119][15372] Avg episode reward: [(0, '44.045')] [2024-08-05 20:11:24,990][15417] Signal inference workers to stop experience collection... (21450 times) [2024-08-05 20:11:24,990][15417] Signal inference workers to resume experience collection... 
(21450 times) [2024-08-05 20:11:25,037][15444] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-08-05 20:11:25,037][15444] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-08-05 20:11:25,665][15444] Updated weights for policy 0, policy_version 58511 (0.0017) [2024-08-05 20:11:28,120][15372] Fps is (10 sec: 23753.4, 60 sec: 24302.4, 300 sec: 24214.9). Total num frames: 479379456. Throughput: 0: 6054.7. Samples: 119845420. Policy #0 lag: (min: 1.0, avg: 4.6, max: 8.0) [2024-08-05 20:11:28,120][15372] Avg episode reward: [(0, '43.560')] [2024-08-05 20:11:28,954][15444] Updated weights for policy 0, policy_version 58521 (0.0022) [2024-08-05 20:11:32,419][15444] Updated weights for policy 0, policy_version 58531 (0.0017) [2024-08-05 20:11:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24242.8). Total num frames: 479502336. Throughput: 0: 6061.5. Samples: 119881580. Policy #0 lag: (min: 1.0, avg: 4.6, max: 8.0) [2024-08-05 20:11:33,119][15372] Avg episode reward: [(0, '42.927')] [2024-08-05 20:11:35,499][15444] Updated weights for policy 0, policy_version 58541 (0.0026) [2024-08-05 20:11:38,118][15372] Fps is (10 sec: 23760.2, 60 sec: 24030.1, 300 sec: 24215.0). Total num frames: 479617024. Throughput: 0: 6040.9. Samples: 119899670. Policy #0 lag: (min: 1.0, avg: 4.6, max: 8.0) [2024-08-05 20:11:38,126][15372] Avg episode reward: [(0, '43.551')] [2024-08-05 20:11:39,239][15444] Updated weights for policy 0, policy_version 58551 (0.0028) [2024-08-05 20:11:42,715][15444] Updated weights for policy 0, policy_version 58561 (0.0019) [2024-08-05 20:11:43,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 479739904. Throughput: 0: 6023.3. Samples: 119935560. 
Policy #0 lag: (min: 1.0, avg: 4.6, max: 8.0) [2024-08-05 20:11:43,119][15372] Avg episode reward: [(0, '44.252')] [2024-08-05 20:11:45,844][15444] Updated weights for policy 0, policy_version 58571 (0.0019) [2024-08-05 20:11:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 479862784. Throughput: 0: 6034.4. Samples: 119971690. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 20:11:48,126][15372] Avg episode reward: [(0, '43.759')] [2024-08-05 20:11:49,459][15444] Updated weights for policy 0, policy_version 58581 (0.0012) [2024-08-05 20:11:52,814][15444] Updated weights for policy 0, policy_version 58591 (0.0023) [2024-08-05 20:11:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 479977472. Throughput: 0: 6012.4. Samples: 119989300. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 20:11:53,119][15372] Avg episode reward: [(0, '44.020')] [2024-08-05 20:11:56,226][15444] Updated weights for policy 0, policy_version 58601 (0.0026) [2024-08-05 20:11:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 480100352. Throughput: 0: 6019.1. Samples: 120025540. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 20:11:58,126][15372] Avg episode reward: [(0, '42.864')] [2024-08-05 20:11:59,558][15444] Updated weights for policy 0, policy_version 58611 (0.0022) [2024-08-05 20:12:02,852][15444] Updated weights for policy 0, policy_version 58621 (0.0033) [2024-08-05 20:12:03,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 480223232. Throughput: 0: 6021.1. Samples: 120062200. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 20:12:03,119][15372] Avg episode reward: [(0, '42.671')] [2024-08-05 20:12:06,372][15444] Updated weights for policy 0, policy_version 58631 (0.0018) [2024-08-05 20:12:08,118][15372] Fps is (10 sec: 23756.9, 60 sec: 23893.4, 300 sec: 24159.5). 
Total num frames: 480337920. Throughput: 0: 6038.5. Samples: 120080630. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 20:12:08,126][15372] Avg episode reward: [(0, '42.378')] [2024-08-05 20:12:09,641][15444] Updated weights for policy 0, policy_version 58641 (0.0020) [2024-08-05 20:12:10,805][15417] Signal inference workers to stop experience collection... (21500 times) [2024-08-05 20:12:10,806][15417] Signal inference workers to resume experience collection... (21500 times) [2024-08-05 20:12:10,853][15444] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-08-05 20:12:10,853][15444] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-08-05 20:12:12,948][15444] Updated weights for policy 0, policy_version 58651 (0.0013) [2024-08-05 20:12:13,118][15372] Fps is (10 sec: 25395.9, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 480477184. Throughput: 0: 6042.4. Samples: 120117320. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 20:12:13,119][15372] Avg episode reward: [(0, '44.026')] [2024-08-05 20:12:16,298][15444] Updated weights for policy 0, policy_version 58661 (0.0012) [2024-08-05 20:12:18,119][15372] Fps is (10 sec: 25394.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 480591872. Throughput: 0: 6049.5. Samples: 120153810. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-08-05 20:12:18,119][15372] Avg episode reward: [(0, '43.444')] [2024-08-05 20:12:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000058666_480591872.pth... 
[2024-08-05 20:12:18,266][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000057957_474783744.pth [2024-08-05 20:12:19,591][15444] Updated weights for policy 0, policy_version 58671 (0.0015) [2024-08-05 20:12:23,058][15444] Updated weights for policy 0, policy_version 58681 (0.0019) [2024-08-05 20:12:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 480714752. Throughput: 0: 6044.0. Samples: 120171650. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:12:23,119][15372] Avg episode reward: [(0, '44.005')] [2024-08-05 20:12:26,627][15444] Updated weights for policy 0, policy_version 58691 (0.0027) [2024-08-05 20:12:28,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.9, 300 sec: 24187.2). Total num frames: 480829440. Throughput: 0: 6048.4. Samples: 120207740. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:12:28,119][15372] Avg episode reward: [(0, '44.273')] [2024-08-05 20:12:30,000][15444] Updated weights for policy 0, policy_version 58701 (0.0020) [2024-08-05 20:12:33,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 480952320. Throughput: 0: 6060.0. Samples: 120244390. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:12:33,126][15372] Avg episode reward: [(0, '43.485')] [2024-08-05 20:12:33,429][15444] Updated weights for policy 0, policy_version 58711 (0.0015) [2024-08-05 20:12:36,565][15444] Updated weights for policy 0, policy_version 58721 (0.0012) [2024-08-05 20:12:38,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 481075200. Throughput: 0: 6072.9. Samples: 120262580. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:12:38,126][15372] Avg episode reward: [(0, '44.148')] [2024-08-05 20:12:39,945][15444] Updated weights for policy 0, policy_version 58731 (0.0018) [2024-08-05 20:12:43,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 481198080. Throughput: 0: 6087.3. Samples: 120299470. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:12:43,126][15372] Avg episode reward: [(0, '44.689')] [2024-08-05 20:12:43,651][15444] Updated weights for policy 0, policy_version 58741 (0.0012) [2024-08-05 20:12:46,816][15444] Updated weights for policy 0, policy_version 58751 (0.0017) [2024-08-05 20:12:48,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 481312768. Throughput: 0: 6059.8. Samples: 120334890. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:12:48,119][15372] Avg episode reward: [(0, '43.335')] [2024-08-05 20:12:50,216][15444] Updated weights for policy 0, policy_version 58761 (0.0020) [2024-08-05 20:12:53,119][15372] Fps is (10 sec: 23755.0, 60 sec: 24302.6, 300 sec: 24159.4). Total num frames: 481435648. Throughput: 0: 6056.3. Samples: 120353170. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:12:53,128][15372] Avg episode reward: [(0, '44.807')] [2024-08-05 20:12:53,709][15444] Updated weights for policy 0, policy_version 58771 (0.0011) [2024-08-05 20:12:56,885][15444] Updated weights for policy 0, policy_version 58781 (0.0011) [2024-08-05 20:12:58,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 481558528. Throughput: 0: 6045.1. Samples: 120389350. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:12:58,119][15372] Avg episode reward: [(0, '44.073')] [2024-08-05 20:13:00,417][15444] Updated weights for policy 0, policy_version 58791 (0.0011) [2024-08-05 20:13:02,287][15417] Signal inference workers to stop experience collection... 
(21550 times) [2024-08-05 20:13:02,287][15417] Signal inference workers to resume experience collection... (21550 times) [2024-08-05 20:13:02,330][15444] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-08-05 20:13:02,330][15444] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-08-05 20:13:03,118][15372] Fps is (10 sec: 24578.3, 60 sec: 24303.1, 300 sec: 24187.2). Total num frames: 481681408. Throughput: 0: 6064.9. Samples: 120426730. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:13:03,119][15372] Avg episode reward: [(0, '43.515')] [2024-08-05 20:13:03,660][15444] Updated weights for policy 0, policy_version 58801 (0.0018) [2024-08-05 20:13:06,841][15444] Updated weights for policy 0, policy_version 58811 (0.0019) [2024-08-05 20:13:08,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24439.4, 300 sec: 24215.0). Total num frames: 481804288. Throughput: 0: 6083.5. Samples: 120445410. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:13:08,119][15372] Avg episode reward: [(0, '43.472')] [2024-08-05 20:13:10,539][15444] Updated weights for policy 0, policy_version 58821 (0.0016) [2024-08-05 20:13:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 481927168. Throughput: 0: 6081.1. Samples: 120481390. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:13:13,126][15372] Avg episode reward: [(0, '42.905')] [2024-08-05 20:13:13,941][15444] Updated weights for policy 0, policy_version 58831 (0.0031) [2024-08-05 20:13:17,163][15444] Updated weights for policy 0, policy_version 58841 (0.0022) [2024-08-05 20:13:18,119][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 482041856. Throughput: 0: 6052.2. Samples: 120516740. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:13:18,129][15372] Avg episode reward: [(0, '44.049')] [2024-08-05 20:13:20,702][15444] Updated weights for policy 0, policy_version 58851 (0.0019) [2024-08-05 20:13:23,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 482156544. Throughput: 0: 6066.0. Samples: 120535550. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:13:23,119][15372] Avg episode reward: [(0, '43.383')] [2024-08-05 20:13:23,901][15444] Updated weights for policy 0, policy_version 58861 (0.0019) [2024-08-05 20:13:27,568][15444] Updated weights for policy 0, policy_version 58871 (0.0012) [2024-08-05 20:13:28,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 482279424. Throughput: 0: 6033.1. Samples: 120570960. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:13:28,119][15372] Avg episode reward: [(0, '42.972')] [2024-08-05 20:13:30,750][15444] Updated weights for policy 0, policy_version 58881 (0.0017) [2024-08-05 20:13:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 482402304. Throughput: 0: 6053.4. Samples: 120607290. Policy #0 lag: (min: 1.0, avg: 4.1, max: 8.0) [2024-08-05 20:13:33,126][15372] Avg episode reward: [(0, '42.969')] [2024-08-05 20:13:34,396][15444] Updated weights for policy 0, policy_version 58891 (0.0012) [2024-08-05 20:13:37,708][15444] Updated weights for policy 0, policy_version 58901 (0.0012) [2024-08-05 20:13:38,119][15372] Fps is (10 sec: 24574.9, 60 sec: 24166.2, 300 sec: 24187.2). Total num frames: 482525184. Throughput: 0: 6043.2. Samples: 120625110. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 20:13:38,119][15372] Avg episode reward: [(0, '44.398')] [2024-08-05 20:13:41,088][15444] Updated weights for policy 0, policy_version 58911 (0.0023) [2024-08-05 20:13:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24215.0). 
Total num frames: 482648064. Throughput: 0: 6036.9. Samples: 120661010. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 20:13:43,120][15372] Avg episode reward: [(0, '44.433')] [2024-08-05 20:13:44,699][15444] Updated weights for policy 0, policy_version 58921 (0.0034) [2024-08-05 20:13:47,949][15444] Updated weights for policy 0, policy_version 58931 (0.0019) [2024-08-05 20:13:48,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 482762752. Throughput: 0: 5990.9. Samples: 120696320. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 20:13:48,119][15372] Avg episode reward: [(0, '43.707')] [2024-08-05 20:13:51,446][15444] Updated weights for policy 0, policy_version 58941 (0.0012) [2024-08-05 20:13:53,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24030.2, 300 sec: 24159.5). Total num frames: 482877440. Throughput: 0: 5992.2. Samples: 120715060. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 20:13:53,126][15372] Avg episode reward: [(0, '43.815')] [2024-08-05 20:13:53,926][15417] Signal inference workers to stop experience collection... (21600 times) [2024-08-05 20:13:53,928][15417] Signal inference workers to resume experience collection... (21600 times) [2024-08-05 20:13:53,977][15444] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-08-05 20:13:53,982][15444] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-08-05 20:13:55,085][15444] Updated weights for policy 0, policy_version 58951 (0.0036) [2024-08-05 20:13:58,119][15372] Fps is (10 sec: 23754.6, 60 sec: 24029.5, 300 sec: 24159.4). Total num frames: 483000320. Throughput: 0: 5993.7. Samples: 120751110. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 20:13:58,120][15372] Avg episode reward: [(0, '44.089')] [2024-08-05 20:13:58,190][15444] Updated weights for policy 0, policy_version 58961 (0.0012) [2024-08-05 20:14:01,771][15444] Updated weights for policy 0, policy_version 58971 (0.0021) [2024-08-05 20:14:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 483123200. Throughput: 0: 5996.9. Samples: 120786600. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 20:14:03,119][15372] Avg episode reward: [(0, '43.562')] [2024-08-05 20:14:04,843][15444] Updated weights for policy 0, policy_version 58981 (0.0032) [2024-08-05 20:14:08,118][15372] Fps is (10 sec: 24578.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 483246080. Throughput: 0: 6003.3. Samples: 120805700. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-08-05 20:14:08,126][15372] Avg episode reward: [(0, '43.667')] [2024-08-05 20:14:08,286][15444] Updated weights for policy 0, policy_version 58991 (0.0013) [2024-08-05 20:14:11,839][15444] Updated weights for policy 0, policy_version 59001 (0.0012) [2024-08-05 20:14:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 483360768. Throughput: 0: 6017.3. Samples: 120841740. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:14:13,119][15372] Avg episode reward: [(0, '42.939')] [2024-08-05 20:14:15,231][15444] Updated weights for policy 0, policy_version 59011 (0.0015) [2024-08-05 20:14:18,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 483483648. Throughput: 0: 6007.6. Samples: 120877630. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:14:18,136][15372] Avg episode reward: [(0, '43.278')] [2024-08-05 20:14:18,216][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000059020_483491840.pth... 
[2024-08-05 20:14:18,380][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000058311_477683712.pth [2024-08-05 20:14:18,561][15444] Updated weights for policy 0, policy_version 59021 (0.0020) [2024-08-05 20:14:22,097][15444] Updated weights for policy 0, policy_version 59031 (0.0012) [2024-08-05 20:14:23,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 483606528. Throughput: 0: 5998.3. Samples: 120895030. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:14:23,119][15372] Avg episode reward: [(0, '44.489')] [2024-08-05 20:14:25,323][15444] Updated weights for policy 0, policy_version 59041 (0.0020) [2024-08-05 20:14:28,118][15372] Fps is (10 sec: 22937.7, 60 sec: 23893.3, 300 sec: 24104.0). Total num frames: 483713024. Throughput: 0: 5992.4. Samples: 120930670. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:14:28,119][15372] Avg episode reward: [(0, '44.362')] [2024-08-05 20:14:29,110][15444] Updated weights for policy 0, policy_version 59051 (0.0012) [2024-08-05 20:14:32,602][15444] Updated weights for policy 0, policy_version 59061 (0.0011) [2024-08-05 20:14:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 483844096. Throughput: 0: 6010.0. Samples: 120966770. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:14:33,119][15372] Avg episode reward: [(0, '43.950')] [2024-08-05 20:14:34,848][15417] Signal inference workers to stop experience collection... (21650 times) [2024-08-05 20:14:34,849][15417] Signal inference workers to resume experience collection... 
(21650 times) [2024-08-05 20:14:34,902][15444] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-08-05 20:14:34,913][15444] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-08-05 20:14:35,645][15444] Updated weights for policy 0, policy_version 59071 (0.0020) [2024-08-05 20:14:38,119][15372] Fps is (10 sec: 25394.8, 60 sec: 24030.0, 300 sec: 24131.7). Total num frames: 483966976. Throughput: 0: 5998.0. Samples: 120984970. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:14:38,119][15372] Avg episode reward: [(0, '43.752')] [2024-08-05 20:14:39,338][15444] Updated weights for policy 0, policy_version 59081 (0.0031) [2024-08-05 20:14:42,774][15444] Updated weights for policy 0, policy_version 59091 (0.0017) [2024-08-05 20:14:43,119][15372] Fps is (10 sec: 23756.3, 60 sec: 23893.2, 300 sec: 24103.9). Total num frames: 484081664. Throughput: 0: 6007.6. Samples: 121021450. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:14:43,119][15372] Avg episode reward: [(0, '43.919')] [2024-08-05 20:14:45,927][15444] Updated weights for policy 0, policy_version 59101 (0.0018) [2024-08-05 20:14:48,120][15372] Fps is (10 sec: 23753.1, 60 sec: 24029.2, 300 sec: 24131.5). Total num frames: 484204544. Throughput: 0: 6009.8. Samples: 121057050. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:14:48,128][15372] Avg episode reward: [(0, '43.089')] [2024-08-05 20:14:49,544][15444] Updated weights for policy 0, policy_version 59111 (0.0014) [2024-08-05 20:14:52,781][15444] Updated weights for policy 0, policy_version 59121 (0.0011) [2024-08-05 20:14:53,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 484319232. Throughput: 0: 5973.8. Samples: 121074520. 
Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 20:14:53,119][15372] Avg episode reward: [(0, '43.619')] [2024-08-05 20:14:56,470][15444] Updated weights for policy 0, policy_version 59131 (0.0030) [2024-08-05 20:14:58,119][15372] Fps is (10 sec: 23760.1, 60 sec: 24030.1, 300 sec: 24103.9). Total num frames: 484442112. Throughput: 0: 5980.4. Samples: 121110860. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 20:14:58,119][15372] Avg episode reward: [(0, '45.520')] [2024-08-05 20:14:59,674][15444] Updated weights for policy 0, policy_version 59141 (0.0014) [2024-08-05 20:15:03,118][15372] Fps is (10 sec: 23756.7, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 484556800. Throughput: 0: 5988.9. Samples: 121147130. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 20:15:03,126][15372] Avg episode reward: [(0, '45.668')] [2024-08-05 20:15:03,182][15444] Updated weights for policy 0, policy_version 59151 (0.0021) [2024-08-05 20:15:06,326][15444] Updated weights for policy 0, policy_version 59161 (0.0018) [2024-08-05 20:15:08,118][15372] Fps is (10 sec: 23757.6, 60 sec: 23893.4, 300 sec: 24103.9). Total num frames: 484679680. Throughput: 0: 6003.1. Samples: 121165170. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 20:15:08,126][15372] Avg episode reward: [(0, '44.578')] [2024-08-05 20:15:09,796][15444] Updated weights for policy 0, policy_version 59171 (0.0022) [2024-08-05 20:15:12,681][15417] Signal inference workers to stop experience collection... (21700 times) [2024-08-05 20:15:12,690][15417] Signal inference workers to resume experience collection... (21700 times) [2024-08-05 20:15:12,747][15444] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-08-05 20:15:12,748][15444] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-08-05 20:15:13,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 484802560. Throughput: 0: 6023.5. 
Samples: 121201730. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 20:15:13,119][15372] Avg episode reward: [(0, '44.222')] [2024-08-05 20:15:13,329][15444] Updated weights for policy 0, policy_version 59181 (0.0015) [2024-08-05 20:15:16,702][15444] Updated weights for policy 0, policy_version 59191 (0.0021) [2024-08-05 20:15:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 484925440. Throughput: 0: 6018.4. Samples: 121237600. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 20:15:18,119][15372] Avg episode reward: [(0, '43.848')] [2024-08-05 20:15:19,951][15444] Updated weights for policy 0, policy_version 59201 (0.0013) [2024-08-05 20:15:23,119][15372] Fps is (10 sec: 24576.3, 60 sec: 24029.8, 300 sec: 24159.5). Total num frames: 485048320. Throughput: 0: 6022.5. Samples: 121255980. Policy #0 lag: (min: 1.0, avg: 4.4, max: 9.0) [2024-08-05 20:15:23,126][15372] Avg episode reward: [(0, '44.435')] [2024-08-05 20:15:23,539][15444] Updated weights for policy 0, policy_version 59211 (0.0023) [2024-08-05 20:15:26,628][15444] Updated weights for policy 0, policy_version 59221 (0.0011) [2024-08-05 20:15:28,120][15372] Fps is (10 sec: 23751.9, 60 sec: 24165.6, 300 sec: 24103.8). Total num frames: 485163008. Throughput: 0: 6009.8. Samples: 121291900. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:15:28,128][15372] Avg episode reward: [(0, '44.791')] [2024-08-05 20:15:30,188][15444] Updated weights for policy 0, policy_version 59231 (0.0013) [2024-08-05 20:15:33,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 485294080. Throughput: 0: 6027.5. Samples: 121328280. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:15:33,126][15372] Avg episode reward: [(0, '45.112')] [2024-08-05 20:15:33,767][15444] Updated weights for policy 0, policy_version 59241 (0.0011) [2024-08-05 20:15:36,973][15444] Updated weights for policy 0, policy_version 59251 (0.0012) [2024-08-05 20:15:38,118][15372] Fps is (10 sec: 24581.1, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 485408768. Throughput: 0: 6044.2. Samples: 121346510. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:15:38,119][15372] Avg episode reward: [(0, '45.001')] [2024-08-05 20:15:40,382][15444] Updated weights for policy 0, policy_version 59261 (0.0011) [2024-08-05 20:15:43,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 485531648. Throughput: 0: 6052.9. Samples: 121383240. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:15:43,123][15372] Avg episode reward: [(0, '44.315')] [2024-08-05 20:15:43,612][15444] Updated weights for policy 0, policy_version 59271 (0.0036) [2024-08-05 20:15:47,253][15444] Updated weights for policy 0, policy_version 59281 (0.0019) [2024-08-05 20:15:48,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24167.0, 300 sec: 24131.7). Total num frames: 485654528. Throughput: 0: 6033.3. Samples: 121418630. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:15:48,119][15372] Avg episode reward: [(0, '44.724')] [2024-08-05 20:15:50,423][15444] Updated weights for policy 0, policy_version 59291 (0.0017) [2024-08-05 20:15:53,119][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 485769216. Throughput: 0: 6042.2. Samples: 121437070. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:15:53,126][15372] Avg episode reward: [(0, '44.595')] [2024-08-05 20:15:53,995][15444] Updated weights for policy 0, policy_version 59301 (0.0028) [2024-08-05 20:15:54,130][15417] Signal inference workers to stop experience collection... 
(21750 times) [2024-08-05 20:15:54,130][15417] Signal inference workers to resume experience collection... (21750 times) [2024-08-05 20:15:54,173][15444] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-08-05 20:15:54,173][15444] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-08-05 20:15:57,341][15444] Updated weights for policy 0, policy_version 59311 (0.0025) [2024-08-05 20:15:58,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 485892096. Throughput: 0: 6025.3. Samples: 121472870. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:15:58,126][15372] Avg episode reward: [(0, '45.030')] [2024-08-05 20:16:00,597][15444] Updated weights for policy 0, policy_version 59321 (0.0025) [2024-08-05 20:16:03,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.0, 300 sec: 24103.9). Total num frames: 486014976. Throughput: 0: 6024.0. Samples: 121508680. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:16:03,126][15372] Avg episode reward: [(0, '44.704')] [2024-08-05 20:16:04,468][15444] Updated weights for policy 0, policy_version 59331 (0.0012) [2024-08-05 20:16:07,767][15444] Updated weights for policy 0, policy_version 59341 (0.0011) [2024-08-05 20:16:08,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.3, 300 sec: 24103.9). Total num frames: 486129664. Throughput: 0: 6010.0. Samples: 121526430. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:16:08,119][15372] Avg episode reward: [(0, '44.074')] [2024-08-05 20:16:10,961][15444] Updated weights for policy 0, policy_version 59351 (0.0023) [2024-08-05 20:16:13,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 486244352. Throughput: 0: 6007.6. Samples: 121562230. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:16:13,127][15372] Avg episode reward: [(0, '43.901')] [2024-08-05 20:16:14,475][15444] Updated weights for policy 0, policy_version 59361 (0.0013) [2024-08-05 20:16:17,908][15444] Updated weights for policy 0, policy_version 59371 (0.0019) [2024-08-05 20:16:18,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24029.9, 300 sec: 24076.2). Total num frames: 486367232. Throughput: 0: 5998.7. Samples: 121598220. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:16:18,119][15372] Avg episode reward: [(0, '44.071')] [2024-08-05 20:16:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000059371_486367232.pth... [2024-08-05 20:16:18,236][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000058666_480591872.pth [2024-08-05 20:16:21,369][15444] Updated weights for policy 0, policy_version 59381 (0.0031) [2024-08-05 20:16:23,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24029.8, 300 sec: 24104.0). Total num frames: 486490112. Throughput: 0: 5997.5. Samples: 121616400. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:16:23,126][15372] Avg episode reward: [(0, '44.706')] [2024-08-05 20:16:24,639][15444] Updated weights for policy 0, policy_version 59391 (0.0025) [2024-08-05 20:16:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.7, 300 sec: 24076.2). Total num frames: 486604800. Throughput: 0: 5996.9. Samples: 121653100. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:16:28,126][15372] Avg episode reward: [(0, '43.280')] [2024-08-05 20:16:28,235][15444] Updated weights for policy 0, policy_version 59401 (0.0035) [2024-08-05 20:16:31,351][15444] Updated weights for policy 0, policy_version 59411 (0.0029) [2024-08-05 20:16:33,119][15372] Fps is (10 sec: 23756.9, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 486727680. 
Throughput: 0: 6002.4. Samples: 121688740. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:16:33,126][15372] Avg episode reward: [(0, '43.816')] [2024-08-05 20:16:34,948][15444] Updated weights for policy 0, policy_version 59421 (0.0023) [2024-08-05 20:16:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 486850560. Throughput: 0: 5997.8. Samples: 121706970. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:16:38,126][15372] Avg episode reward: [(0, '44.280')] [2024-08-05 20:16:38,494][15444] Updated weights for policy 0, policy_version 59431 (0.0011) [2024-08-05 20:16:39,220][15417] Signal inference workers to stop experience collection... (21800 times) [2024-08-05 20:16:39,220][15417] Signal inference workers to resume experience collection... (21800 times) [2024-08-05 20:16:39,285][15444] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-08-05 20:16:39,285][15444] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-08-05 20:16:41,561][15444] Updated weights for policy 0, policy_version 59441 (0.0020) [2024-08-05 20:16:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24030.0, 300 sec: 24103.9). Total num frames: 486973440. Throughput: 0: 6012.7. Samples: 121743440. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:16:43,119][15372] Avg episode reward: [(0, '43.988')] [2024-08-05 20:16:45,101][15444] Updated weights for policy 0, policy_version 59451 (0.0018) [2024-08-05 20:16:48,118][15372] Fps is (10 sec: 25395.5, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 487104512. Throughput: 0: 6034.2. Samples: 121780220. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:16:48,119][15444] Updated weights for policy 0, policy_version 59461 (0.0013) [2024-08-05 20:16:48,126][15372] Avg episode reward: [(0, '44.379')] [2024-08-05 20:16:51,809][15444] Updated weights for policy 0, policy_version 59471 (0.0023) [2024-08-05 20:16:53,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24029.7, 300 sec: 24103.9). Total num frames: 487211008. Throughput: 0: 6046.2. Samples: 121798510. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:16:53,120][15372] Avg episode reward: [(0, '44.824')] [2024-08-05 20:16:55,087][15444] Updated weights for policy 0, policy_version 59481 (0.0016) [2024-08-05 20:16:58,127][15372] Fps is (10 sec: 23736.2, 60 sec: 24162.9, 300 sec: 24131.0). Total num frames: 487342080. Throughput: 0: 6051.5. Samples: 121834600. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:16:58,135][15372] Avg episode reward: [(0, '43.839')] [2024-08-05 20:16:58,483][15444] Updated weights for policy 0, policy_version 59491 (0.0022) [2024-08-05 20:17:01,952][15444] Updated weights for policy 0, policy_version 59501 (0.0014) [2024-08-05 20:17:03,119][15372] Fps is (10 sec: 24577.0, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 487456768. Throughput: 0: 6053.8. Samples: 121870640. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:17:03,119][15372] Avg episode reward: [(0, '44.395')] [2024-08-05 20:17:05,169][15444] Updated weights for policy 0, policy_version 59511 (0.0029) [2024-08-05 20:17:08,119][15372] Fps is (10 sec: 22957.3, 60 sec: 24029.9, 300 sec: 24048.4). Total num frames: 487571456. Throughput: 0: 6056.9. Samples: 121888960. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:17:08,126][15372] Avg episode reward: [(0, '44.442')] [2024-08-05 20:17:08,784][15444] Updated weights for policy 0, policy_version 59521 (0.0016) [2024-08-05 20:17:12,369][15444] Updated weights for policy 0, policy_version 59531 (0.0019) [2024-08-05 20:17:13,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.0, 300 sec: 24103.9). Total num frames: 487702528. Throughput: 0: 6033.6. Samples: 121924610. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:17:13,119][15372] Avg episode reward: [(0, '44.204')] [2024-08-05 20:17:15,633][15444] Updated weights for policy 0, policy_version 59541 (0.0022) [2024-08-05 20:17:18,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.3, 300 sec: 24076.1). Total num frames: 487817216. Throughput: 0: 6053.5. Samples: 121961150. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:17:18,126][15372] Avg episode reward: [(0, '43.865')] [2024-08-05 20:17:18,986][15444] Updated weights for policy 0, policy_version 59551 (0.0014) [2024-08-05 20:17:22,221][15444] Updated weights for policy 0, policy_version 59561 (0.0013) [2024-08-05 20:17:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 487940096. Throughput: 0: 6049.4. Samples: 121979190. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:17:23,126][15372] Avg episode reward: [(0, '43.816')] [2024-08-05 20:17:25,590][15417] Signal inference workers to stop experience collection... (21850 times) [2024-08-05 20:17:25,592][15417] Signal inference workers to resume experience collection... 
(21850 times) [2024-08-05 20:17:25,649][15444] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-08-05 20:17:25,649][15444] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-08-05 20:17:25,688][15444] Updated weights for policy 0, policy_version 59571 (0.0021) [2024-08-05 20:17:28,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24103.9). Total num frames: 488062976. Throughput: 0: 6026.2. Samples: 122014620. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:17:28,130][15372] Avg episode reward: [(0, '44.222')] [2024-08-05 20:17:29,212][15444] Updated weights for policy 0, policy_version 59581 (0.0023) [2024-08-05 20:17:32,393][15444] Updated weights for policy 0, policy_version 59591 (0.0019) [2024-08-05 20:17:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24303.0, 300 sec: 24103.9). Total num frames: 488185856. Throughput: 0: 6030.2. Samples: 122051580. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:17:33,126][15372] Avg episode reward: [(0, '43.711')] [2024-08-05 20:17:35,825][15444] Updated weights for policy 0, policy_version 59601 (0.0013) [2024-08-05 20:17:38,119][15372] Fps is (10 sec: 23755.6, 60 sec: 24166.3, 300 sec: 24076.1). Total num frames: 488300544. Throughput: 0: 6033.1. Samples: 122070000. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:17:38,127][15372] Avg episode reward: [(0, '44.014')] [2024-08-05 20:17:39,057][15444] Updated weights for policy 0, policy_version 59611 (0.0019) [2024-08-05 20:17:42,821][15444] Updated weights for policy 0, policy_version 59621 (0.0022) [2024-08-05 20:17:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 488423424. Throughput: 0: 6032.1. Samples: 122105990. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:17:43,119][15372] Avg episode reward: [(0, '43.744')] [2024-08-05 20:17:46,020][15444] Updated weights for policy 0, policy_version 59631 (0.0020) [2024-08-05 20:17:48,119][15372] Fps is (10 sec: 24576.9, 60 sec: 24029.8, 300 sec: 24104.0). Total num frames: 488546304. Throughput: 0: 6036.2. Samples: 122142270. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:17:48,119][15372] Avg episode reward: [(0, '43.807')] [2024-08-05 20:17:49,437][15444] Updated weights for policy 0, policy_version 59641 (0.0023) [2024-08-05 20:17:53,115][15444] Updated weights for policy 0, policy_version 59651 (0.0021) [2024-08-05 20:17:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.6, 300 sec: 24076.1). Total num frames: 488660992. Throughput: 0: 6026.7. Samples: 122160160. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:17:53,123][15372] Avg episode reward: [(0, '44.597')] [2024-08-05 20:17:56,200][15444] Updated weights for policy 0, policy_version 59661 (0.0017) [2024-08-05 20:17:58,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24033.4, 300 sec: 24076.1). Total num frames: 488783872. Throughput: 0: 6028.4. Samples: 122195890. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:17:58,126][15372] Avg episode reward: [(0, '44.173')] [2024-08-05 20:17:59,777][15444] Updated weights for policy 0, policy_version 59671 (0.0015) [2024-08-05 20:18:03,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.9, 300 sec: 24048.4). Total num frames: 488898560. Throughput: 0: 6025.3. Samples: 122232290. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:18:03,126][15372] Avg episode reward: [(0, '42.930')] [2024-08-05 20:18:03,287][15444] Updated weights for policy 0, policy_version 59681 (0.0011) [2024-08-05 20:18:06,426][15444] Updated weights for policy 0, policy_version 59691 (0.0014) [2024-08-05 20:18:08,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24076.1). 
Total num frames: 489029632. Throughput: 0: 6032.2. Samples: 122250640. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:18:08,126][15372] Avg episode reward: [(0, '42.866')] [2024-08-05 20:18:09,944][15444] Updated weights for policy 0, policy_version 59701 (0.0011) [2024-08-05 20:18:13,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24029.7, 300 sec: 24076.1). Total num frames: 489144320. Throughput: 0: 6053.0. Samples: 122287010. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:18:13,127][15372] Avg episode reward: [(0, '44.132')] [2024-08-05 20:18:13,219][15444] Updated weights for policy 0, policy_version 59711 (0.0012) [2024-08-05 20:18:16,796][15444] Updated weights for policy 0, policy_version 59721 (0.0014) [2024-08-05 20:18:18,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 489267200. Throughput: 0: 6023.3. Samples: 122322630. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:18:18,119][15372] Avg episode reward: [(0, '44.529')] [2024-08-05 20:18:18,196][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000059726_489275392.pth... [2024-08-05 20:18:18,307][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000059020_483491840.pth [2024-08-05 20:18:20,034][15444] Updated weights for policy 0, policy_version 59731 (0.0021) [2024-08-05 20:18:21,042][15417] Signal inference workers to stop experience collection... (21900 times) [2024-08-05 20:18:21,042][15417] Signal inference workers to resume experience collection... (21900 times) [2024-08-05 20:18:21,088][15444] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-08-05 20:18:21,089][15444] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-08-05 20:18:23,119][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.3, 300 sec: 24103.9). 
Total num frames: 489390080. Throughput: 0: 6028.5. Samples: 122341280. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:18:23,119][15372] Avg episode reward: [(0, '44.405')] [2024-08-05 20:18:23,353][15444] Updated weights for policy 0, policy_version 59741 (0.0031) [2024-08-05 20:18:26,873][15444] Updated weights for policy 0, policy_version 59751 (0.0018) [2024-08-05 20:18:28,119][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 489512960. Throughput: 0: 6038.7. Samples: 122377730. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:18:28,119][15372] Avg episode reward: [(0, '43.762')] [2024-08-05 20:18:29,973][15444] Updated weights for policy 0, policy_version 59761 (0.0013) [2024-08-05 20:18:33,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24104.0). Total num frames: 489635840. Throughput: 0: 6038.7. Samples: 122414010. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:18:33,126][15372] Avg episode reward: [(0, '43.047')] [2024-08-05 20:18:33,586][15444] Updated weights for policy 0, policy_version 59771 (0.0025) [2024-08-05 20:18:36,798][15444] Updated weights for policy 0, policy_version 59781 (0.0042) [2024-08-05 20:18:38,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.6, 300 sec: 24076.1). Total num frames: 489750528. Throughput: 0: 6052.9. Samples: 122432540. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:18:38,119][15372] Avg episode reward: [(0, '43.792')] [2024-08-05 20:18:40,308][15444] Updated weights for policy 0, policy_version 59791 (0.0025) [2024-08-05 20:18:43,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 489873408. Throughput: 0: 6055.6. Samples: 122468390. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:18:43,119][15372] Avg episode reward: [(0, '43.529')] [2024-08-05 20:18:43,797][15444] Updated weights for policy 0, policy_version 59801 (0.0017) [2024-08-05 20:18:47,288][15444] Updated weights for policy 0, policy_version 59811 (0.0025) [2024-08-05 20:18:48,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 489988096. Throughput: 0: 6032.5. Samples: 122503750. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:18:48,119][15372] Avg episode reward: [(0, '43.852')] [2024-08-05 20:18:50,695][15444] Updated weights for policy 0, policy_version 59821 (0.0017) [2024-08-05 20:18:53,119][15372] Fps is (10 sec: 23756.3, 60 sec: 24166.3, 300 sec: 24104.0). Total num frames: 490110976. Throughput: 0: 6042.6. Samples: 122522560. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:18:53,119][15372] Avg episode reward: [(0, '43.750')] [2024-08-05 20:18:53,841][15444] Updated weights for policy 0, policy_version 59831 (0.0022) [2024-08-05 20:18:57,506][15444] Updated weights for policy 0, policy_version 59841 (0.0019) [2024-08-05 20:18:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 490233856. Throughput: 0: 6028.7. Samples: 122558300. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:18:58,119][15372] Avg episode reward: [(0, '43.848')] [2024-08-05 20:19:00,581][15444] Updated weights for policy 0, policy_version 59851 (0.0019) [2024-08-05 20:19:03,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24166.2, 300 sec: 24076.1). Total num frames: 490348544. Throughput: 0: 6038.2. Samples: 122594350. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:19:03,127][15372] Avg episode reward: [(0, '44.098')] [2024-08-05 20:19:04,170][15444] Updated weights for policy 0, policy_version 59861 (0.0020) [2024-08-05 20:19:07,648][15444] Updated weights for policy 0, policy_version 59871 (0.0015) [2024-08-05 20:19:08,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 490471424. Throughput: 0: 6033.1. Samples: 122612770. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 20:19:08,119][15372] Avg episode reward: [(0, '43.852')] [2024-08-05 20:19:10,740][15444] Updated weights for policy 0, policy_version 59881 (0.0011) [2024-08-05 20:19:13,118][15372] Fps is (10 sec: 24577.2, 60 sec: 24166.6, 300 sec: 24103.9). Total num frames: 490594304. Throughput: 0: 6020.5. Samples: 122648650. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 20:19:13,126][15372] Avg episode reward: [(0, '44.716')] [2024-08-05 20:19:14,391][15444] Updated weights for policy 0, policy_version 59891 (0.0012) [2024-08-05 20:19:17,830][15444] Updated weights for policy 0, policy_version 59901 (0.0020) [2024-08-05 20:19:18,119][15372] Fps is (10 sec: 24574.5, 60 sec: 24166.2, 300 sec: 24103.9). Total num frames: 490717184. Throughput: 0: 6012.4. Samples: 122684570. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 20:19:18,119][15372] Avg episode reward: [(0, '44.856')] [2024-08-05 20:19:18,982][15417] Signal inference workers to stop experience collection... (21950 times) [2024-08-05 20:19:18,982][15417] Signal inference workers to resume experience collection... 
(21950 times) [2024-08-05 20:19:19,011][15444] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-08-05 20:19:19,057][15444] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-08-05 20:19:21,098][15444] Updated weights for policy 0, policy_version 59911 (0.0011) [2024-08-05 20:19:23,125][15372] Fps is (10 sec: 23741.5, 60 sec: 24027.4, 300 sec: 24131.2). Total num frames: 490831872. Throughput: 0: 6009.1. Samples: 122702990. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 20:19:23,125][15372] Avg episode reward: [(0, '44.552')] [2024-08-05 20:19:24,652][15444] Updated weights for policy 0, policy_version 59921 (0.0014) [2024-08-05 20:19:27,859][15444] Updated weights for policy 0, policy_version 59931 (0.0012) [2024-08-05 20:19:28,118][15372] Fps is (10 sec: 23758.3, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 490954752. Throughput: 0: 6015.6. Samples: 122739090. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 20:19:28,119][15372] Avg episode reward: [(0, '44.327')] [2024-08-05 20:19:31,627][15444] Updated weights for policy 0, policy_version 59941 (0.0022) [2024-08-05 20:19:33,118][15372] Fps is (10 sec: 23772.0, 60 sec: 23893.3, 300 sec: 24076.2). Total num frames: 491069440. Throughput: 0: 6017.3. Samples: 122774530. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 20:19:33,119][15372] Avg episode reward: [(0, '43.944')] [2024-08-05 20:19:34,643][15444] Updated weights for policy 0, policy_version 59951 (0.0012) [2024-08-05 20:19:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 491192320. Throughput: 0: 6023.4. Samples: 122793610. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 20:19:38,126][15372] Avg episode reward: [(0, '44.083')] [2024-08-05 20:19:38,185][15444] Updated weights for policy 0, policy_version 59961 (0.0017) [2024-08-05 20:19:41,620][15444] Updated weights for policy 0, policy_version 59971 (0.0020) [2024-08-05 20:19:43,119][15372] Fps is (10 sec: 24574.8, 60 sec: 24029.6, 300 sec: 24104.0). Total num frames: 491315200. Throughput: 0: 6029.7. Samples: 122829640. Policy #0 lag: (min: 1.0, avg: 3.6, max: 8.0) [2024-08-05 20:19:43,127][15372] Avg episode reward: [(0, '44.304')] [2024-08-05 20:19:44,849][15444] Updated weights for policy 0, policy_version 59981 (0.0013) [2024-08-05 20:19:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 491438080. Throughput: 0: 6026.3. Samples: 122865530. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:19:48,126][15372] Avg episode reward: [(0, '44.346')] [2024-08-05 20:19:48,522][15444] Updated weights for policy 0, policy_version 59991 (0.0011) [2024-08-05 20:19:51,823][15444] Updated weights for policy 0, policy_version 60001 (0.0031) [2024-08-05 20:19:53,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 491552768. Throughput: 0: 6026.8. Samples: 122883980. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:19:53,119][15372] Avg episode reward: [(0, '44.643')] [2024-08-05 20:19:55,152][15444] Updated weights for policy 0, policy_version 60011 (0.0013) [2024-08-05 20:19:57,525][15417] Signal inference workers to stop experience collection... (22000 times) [2024-08-05 20:19:57,533][15417] Signal inference workers to resume experience collection... 
(22000 times) [2024-08-05 20:19:57,563][15444] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-08-05 20:19:57,568][15444] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-08-05 20:19:58,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 491675648. Throughput: 0: 6012.9. Samples: 122919230. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:19:58,126][15372] Avg episode reward: [(0, '45.193')] [2024-08-05 20:19:59,036][15444] Updated weights for policy 0, policy_version 60021 (0.0011) [2024-08-05 20:20:01,940][15444] Updated weights for policy 0, policy_version 60031 (0.0013) [2024-08-05 20:20:03,118][15372] Fps is (10 sec: 23757.9, 60 sec: 24030.1, 300 sec: 24103.9). Total num frames: 491790336. Throughput: 0: 6014.1. Samples: 122955200. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:20:03,119][15372] Avg episode reward: [(0, '44.961')] [2024-08-05 20:20:05,512][15444] Updated weights for policy 0, policy_version 60041 (0.0030) [2024-08-05 20:20:08,119][15372] Fps is (10 sec: 24575.0, 60 sec: 24166.2, 300 sec: 24131.7). Total num frames: 491921408. Throughput: 0: 6015.5. Samples: 122973650. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:20:08,119][15372] Avg episode reward: [(0, '43.790')] [2024-08-05 20:20:08,927][15444] Updated weights for policy 0, policy_version 60051 (0.0019) [2024-08-05 20:20:12,108][15444] Updated weights for policy 0, policy_version 60061 (0.0011) [2024-08-05 20:20:13,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24029.8, 300 sec: 24103.9). Total num frames: 492036096. Throughput: 0: 6029.1. Samples: 123010400. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:20:13,119][15372] Avg episode reward: [(0, '42.992')] [2024-08-05 20:20:15,573][15444] Updated weights for policy 0, policy_version 60071 (0.0034) [2024-08-05 20:20:18,118][15372] Fps is (10 sec: 23758.1, 60 sec: 24030.1, 300 sec: 24103.9). 
Total num frames: 492158976. Throughput: 0: 6057.3. Samples: 123047110. Policy #0 lag: (min: 0.0, avg: 4.6, max: 9.0) [2024-08-05 20:20:18,119][15372] Avg episode reward: [(0, '43.299')] [2024-08-05 20:20:18,144][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000060079_492167168.pth... [2024-08-05 20:20:18,285][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000059371_486367232.pth [2024-08-05 20:20:18,926][15444] Updated weights for policy 0, policy_version 60081 (0.0021) [2024-08-05 20:20:22,259][15444] Updated weights for policy 0, policy_version 60091 (0.0014) [2024-08-05 20:20:23,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24169.0, 300 sec: 24131.9). Total num frames: 492281856. Throughput: 0: 6017.1. Samples: 123064380. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 20:20:23,126][15372] Avg episode reward: [(0, '44.848')] [2024-08-05 20:20:25,711][15444] Updated weights for policy 0, policy_version 60101 (0.0013) [2024-08-05 20:20:28,119][15372] Fps is (10 sec: 24575.1, 60 sec: 24166.2, 300 sec: 24103.9). Total num frames: 492404736. Throughput: 0: 6032.5. Samples: 123101100. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 20:20:28,119][15372] Avg episode reward: [(0, '44.523')] [2024-08-05 20:20:28,995][15444] Updated weights for policy 0, policy_version 60111 (0.0019) [2024-08-05 20:20:32,462][15444] Updated weights for policy 0, policy_version 60121 (0.0017) [2024-08-05 20:20:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 492519424. Throughput: 0: 6023.8. Samples: 123136600. 
Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 20:20:33,119][15372] Avg episode reward: [(0, '43.921')] [2024-08-05 20:20:35,914][15444] Updated weights for policy 0, policy_version 60131 (0.0016) [2024-08-05 20:20:38,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 492642304. Throughput: 0: 6027.6. Samples: 123155220. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 20:20:38,126][15372] Avg episode reward: [(0, '44.389')] [2024-08-05 20:20:39,268][15444] Updated weights for policy 0, policy_version 60141 (0.0013) [2024-08-05 20:20:42,729][15444] Updated weights for policy 0, policy_version 60151 (0.0014) [2024-08-05 20:20:43,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.6, 300 sec: 24103.9). Total num frames: 492765184. Throughput: 0: 6044.5. Samples: 123191230. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 20:20:43,119][15372] Avg episode reward: [(0, '44.217')] [2024-08-05 20:20:45,958][15444] Updated weights for policy 0, policy_version 60161 (0.0026) [2024-08-05 20:20:48,120][15372] Fps is (10 sec: 24572.2, 60 sec: 24165.8, 300 sec: 24131.6). Total num frames: 492888064. Throughput: 0: 6040.0. Samples: 123227010. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 20:20:48,128][15372] Avg episode reward: [(0, '44.012')] [2024-08-05 20:20:49,534][15444] Updated weights for policy 0, policy_version 60171 (0.0016) [2024-08-05 20:20:51,393][15417] Signal inference workers to stop experience collection... (22050 times) [2024-08-05 20:20:51,393][15417] Signal inference workers to resume experience collection... 
(22050 times) [2024-08-05 20:20:51,442][15444] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-08-05 20:20:51,442][15444] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-08-05 20:20:52,874][15444] Updated weights for policy 0, policy_version 60181 (0.0017) [2024-08-05 20:20:53,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 493002752. Throughput: 0: 6035.6. Samples: 123245250. Policy #0 lag: (min: 1.0, avg: 3.6, max: 7.0) [2024-08-05 20:20:53,119][15372] Avg episode reward: [(0, '43.881')] [2024-08-05 20:20:56,155][15444] Updated weights for policy 0, policy_version 60191 (0.0014) [2024-08-05 20:20:58,118][15372] Fps is (10 sec: 23760.6, 60 sec: 24166.5, 300 sec: 24103.9). Total num frames: 493125632. Throughput: 0: 6030.9. Samples: 123281790. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:20:58,126][15372] Avg episode reward: [(0, '43.366')] [2024-08-05 20:20:59,558][15444] Updated weights for policy 0, policy_version 60201 (0.0033) [2024-08-05 20:21:02,946][15444] Updated weights for policy 0, policy_version 60211 (0.0022) [2024-08-05 20:21:03,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.8, 300 sec: 24131.7). Total num frames: 493248512. Throughput: 0: 6014.4. Samples: 123317760. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:21:03,119][15372] Avg episode reward: [(0, '42.391')] [2024-08-05 20:21:06,327][15444] Updated weights for policy 0, policy_version 60221 (0.0023) [2024-08-05 20:21:08,119][15372] Fps is (10 sec: 23754.9, 60 sec: 24029.8, 300 sec: 24131.6). Total num frames: 493363200. Throughput: 0: 6050.8. Samples: 123336670. 
Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:21:08,128][15372] Avg episode reward: [(0, '42.326')] [2024-08-05 20:21:09,871][15444] Updated weights for policy 0, policy_version 60231 (0.0015) [2024-08-05 20:21:13,089][15444] Updated weights for policy 0, policy_version 60241 (0.0022) [2024-08-05 20:21:13,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 493494272. Throughput: 0: 6045.2. Samples: 123373130. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:21:13,119][15372] Avg episode reward: [(0, '43.058')] [2024-08-05 20:21:16,703][15444] Updated weights for policy 0, policy_version 60251 (0.0011) [2024-08-05 20:21:18,119][15372] Fps is (10 sec: 24577.2, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 493608960. Throughput: 0: 6051.7. Samples: 123408930. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:21:18,119][15372] Avg episode reward: [(0, '43.840')] [2024-08-05 20:21:19,642][15444] Updated weights for policy 0, policy_version 60261 (0.0023) [2024-08-05 20:21:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 493731840. Throughput: 0: 6064.5. Samples: 123428120. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:21:23,126][15372] Avg episode reward: [(0, '43.625')] [2024-08-05 20:21:23,284][15444] Updated weights for policy 0, policy_version 60271 (0.0013) [2024-08-05 20:21:26,837][15444] Updated weights for policy 0, policy_version 60281 (0.0019) [2024-08-05 20:21:28,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24166.6, 300 sec: 24159.5). Total num frames: 493854720. Throughput: 0: 6061.3. Samples: 123463990. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:21:28,119][15372] Avg episode reward: [(0, '44.307')] [2024-08-05 20:21:29,884][15444] Updated weights for policy 0, policy_version 60291 (0.0025) [2024-08-05 20:21:33,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24131.7). 
Total num frames: 493969408. Throughput: 0: 6067.8. Samples: 123500050. Policy #0 lag: (min: 0.0, avg: 3.5, max: 7.0) [2024-08-05 20:21:33,126][15372] Avg episode reward: [(0, '44.984')] [2024-08-05 20:21:33,605][15444] Updated weights for policy 0, policy_version 60301 (0.0011) [2024-08-05 20:21:36,570][15444] Updated weights for policy 0, policy_version 60311 (0.0011) [2024-08-05 20:21:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 494092288. Throughput: 0: 6086.7. Samples: 123519150. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:21:38,126][15372] Avg episode reward: [(0, '44.509')] [2024-08-05 20:21:38,532][15417] Signal inference workers to stop experience collection... (22100 times) [2024-08-05 20:21:38,538][15417] Signal inference workers to resume experience collection... (22100 times) [2024-08-05 20:21:38,613][15444] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-08-05 20:21:38,613][15444] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-08-05 20:21:40,251][15444] Updated weights for policy 0, policy_version 60321 (0.0012) [2024-08-05 20:21:43,118][15372] Fps is (10 sec: 25395.4, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 494223360. Throughput: 0: 6084.7. Samples: 123555600. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:21:43,119][15372] Avg episode reward: [(0, '44.013')] [2024-08-05 20:21:43,402][15444] Updated weights for policy 0, policy_version 60331 (0.0045) [2024-08-05 20:21:46,970][15444] Updated weights for policy 0, policy_version 60341 (0.0020) [2024-08-05 20:21:48,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24166.9, 300 sec: 24159.5). Total num frames: 494338048. Throughput: 0: 6068.0. Samples: 123590820. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:21:48,119][15372] Avg episode reward: [(0, '43.891')] [2024-08-05 20:21:50,265][15444] Updated weights for policy 0, policy_version 60351 (0.0020) [2024-08-05 20:21:53,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24303.0, 300 sec: 24132.4). Total num frames: 494460928. Throughput: 0: 6074.8. Samples: 123610030. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:21:53,126][15372] Avg episode reward: [(0, '43.968')] [2024-08-05 20:21:53,502][15444] Updated weights for policy 0, policy_version 60361 (0.0014) [2024-08-05 20:21:57,367][15444] Updated weights for policy 0, policy_version 60371 (0.0028) [2024-08-05 20:21:58,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 494583808. Throughput: 0: 6049.1. Samples: 123645340. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:21:58,119][15372] Avg episode reward: [(0, '42.712')] [2024-08-05 20:22:00,457][15444] Updated weights for policy 0, policy_version 60381 (0.0012) [2024-08-05 20:22:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 494698496. Throughput: 0: 6040.9. Samples: 123680770. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:22:03,126][15372] Avg episode reward: [(0, '42.720')] [2024-08-05 20:22:04,172][15444] Updated weights for policy 0, policy_version 60391 (0.0020) [2024-08-05 20:22:07,531][15444] Updated weights for policy 0, policy_version 60401 (0.0025) [2024-08-05 20:22:08,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24166.7, 300 sec: 24103.9). Total num frames: 494813184. Throughput: 0: 6017.1. Samples: 123698890. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:22:08,119][15372] Avg episode reward: [(0, '44.096')] [2024-08-05 20:22:10,675][15444] Updated weights for policy 0, policy_version 60411 (0.0013) [2024-08-05 20:22:12,913][15417] Signal inference workers to stop experience collection... 
(22150 times) [2024-08-05 20:22:12,921][15417] Signal inference workers to resume experience collection... (22150 times) [2024-08-05 20:22:12,959][15444] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-08-05 20:22:12,959][15444] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-08-05 20:22:13,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 494944256. Throughput: 0: 6022.2. Samples: 123734990. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 20:22:13,119][15372] Avg episode reward: [(0, '44.757')] [2024-08-05 20:22:14,448][15444] Updated weights for policy 0, policy_version 60421 (0.0031) [2024-08-05 20:22:17,838][15444] Updated weights for policy 0, policy_version 60431 (0.0029) [2024-08-05 20:22:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 495058944. Throughput: 0: 6016.0. Samples: 123770770. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 20:22:18,119][15372] Avg episode reward: [(0, '43.776')] [2024-08-05 20:22:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000060432_495058944.pth... [2024-08-05 20:22:18,268][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000059726_489275392.pth [2024-08-05 20:22:21,115][15444] Updated weights for policy 0, policy_version 60441 (0.0021) [2024-08-05 20:22:23,118][15372] Fps is (10 sec: 22937.8, 60 sec: 24029.9, 300 sec: 24103.9). Total num frames: 495173632. Throughput: 0: 5993.8. Samples: 123788870. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 20:22:23,119][15372] Avg episode reward: [(0, '44.103')] [2024-08-05 20:22:24,524][15444] Updated weights for policy 0, policy_version 60451 (0.0016) [2024-08-05 20:22:27,711][15444] Updated weights for policy 0, policy_version 60461 (0.0019) [2024-08-05 20:22:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 495304704. Throughput: 0: 5997.1. Samples: 123825470. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 20:22:28,119][15372] Avg episode reward: [(0, '44.475')] [2024-08-05 20:22:31,386][15444] Updated weights for policy 0, policy_version 60471 (0.0027) [2024-08-05 20:22:33,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 495419392. Throughput: 0: 6008.9. Samples: 123861220. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 20:22:33,119][15372] Avg episode reward: [(0, '43.791')] [2024-08-05 20:22:34,596][15444] Updated weights for policy 0, policy_version 60481 (0.0010) [2024-08-05 20:22:37,971][15444] Updated weights for policy 0, policy_version 60491 (0.0019) [2024-08-05 20:22:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 495542272. Throughput: 0: 6001.1. Samples: 123880080. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 20:22:38,119][15372] Avg episode reward: [(0, '43.594')] [2024-08-05 20:22:41,267][15444] Updated weights for policy 0, policy_version 60501 (0.0024) [2024-08-05 20:22:43,119][15372] Fps is (10 sec: 23754.9, 60 sec: 23893.0, 300 sec: 24103.9). Total num frames: 495656960. Throughput: 0: 6013.2. Samples: 123915940. 
Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 20:22:43,128][15372] Avg episode reward: [(0, '44.303')] [2024-08-05 20:22:44,775][15444] Updated weights for policy 0, policy_version 60511 (0.0016) [2024-08-05 20:22:48,026][15444] Updated weights for policy 0, policy_version 60521 (0.0018) [2024-08-05 20:22:48,120][15372] Fps is (10 sec: 24573.0, 60 sec: 24166.0, 300 sec: 24159.4). Total num frames: 495788032. Throughput: 0: 6040.9. Samples: 123952620. Policy #0 lag: (min: 0.0, avg: 3.6, max: 8.0) [2024-08-05 20:22:48,120][15372] Avg episode reward: [(0, '44.867')] [2024-08-05 20:22:51,454][15444] Updated weights for policy 0, policy_version 60531 (0.0013) [2024-08-05 20:22:53,119][15372] Fps is (10 sec: 25397.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 495910912. Throughput: 0: 6048.2. Samples: 123971060. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:22:53,127][15372] Avg episode reward: [(0, '44.037')] [2024-08-05 20:22:54,973][15444] Updated weights for policy 0, policy_version 60541 (0.0023) [2024-08-05 20:22:58,118][15372] Fps is (10 sec: 23759.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 496025600. Throughput: 0: 6056.9. Samples: 124007550. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:22:58,126][15372] Avg episode reward: [(0, '44.002')] [2024-08-05 20:22:58,199][15444] Updated weights for policy 0, policy_version 60551 (0.0029) [2024-08-05 20:23:01,507][15444] Updated weights for policy 0, policy_version 60561 (0.0016) [2024-08-05 20:23:03,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 496148480. Throughput: 0: 6050.2. Samples: 124043030. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:23:03,126][15372] Avg episode reward: [(0, '43.802')] [2024-08-05 20:23:05,067][15444] Updated weights for policy 0, policy_version 60571 (0.0013) [2024-08-05 20:23:08,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24159.5). 
Total num frames: 496271360. Throughput: 0: 6056.4. Samples: 124061410. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:23:08,126][15372] Avg episode reward: [(0, '43.908')] [2024-08-05 20:23:08,534][15444] Updated weights for policy 0, policy_version 60581 (0.0013) [2024-08-05 20:23:11,703][15444] Updated weights for policy 0, policy_version 60591 (0.0023) [2024-08-05 20:23:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 496386048. Throughput: 0: 6052.0. Samples: 124097810. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:23:13,119][15372] Avg episode reward: [(0, '43.148')] [2024-08-05 20:23:13,948][15417] Signal inference workers to stop experience collection... (22200 times) [2024-08-05 20:23:13,949][15417] Signal inference workers to resume experience collection... (22200 times) [2024-08-05 20:23:13,995][15444] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-08-05 20:23:13,996][15444] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-08-05 20:23:15,192][15444] Updated weights for policy 0, policy_version 60601 (0.0018) [2024-08-05 20:23:18,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 496517120. Throughput: 0: 6070.2. Samples: 124134380. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:23:18,119][15372] Avg episode reward: [(0, '43.787')] [2024-08-05 20:23:18,326][15444] Updated weights for policy 0, policy_version 60611 (0.0014) [2024-08-05 20:23:22,186][15444] Updated weights for policy 0, policy_version 60621 (0.0042) [2024-08-05 20:23:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 496631808. Throughput: 0: 6048.0. Samples: 124152240. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:23:23,119][15372] Avg episode reward: [(0, '43.599')] [2024-08-05 20:23:25,296][15444] Updated weights for policy 0, policy_version 60631 (0.0012) [2024-08-05 20:23:28,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 496754688. Throughput: 0: 6063.7. Samples: 124188800. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 20:23:28,119][15372] Avg episode reward: [(0, '44.978')] [2024-08-05 20:23:28,698][15444] Updated weights for policy 0, policy_version 60641 (0.0018) [2024-08-05 20:23:32,210][15444] Updated weights for policy 0, policy_version 60651 (0.0019) [2024-08-05 20:23:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 496877568. Throughput: 0: 6043.7. Samples: 124224580. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 20:23:33,119][15372] Avg episode reward: [(0, '44.918')] [2024-08-05 20:23:35,417][15444] Updated weights for policy 0, policy_version 60661 (0.0016) [2024-08-05 20:23:38,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 496992256. Throughput: 0: 6049.1. Samples: 124243270. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 20:23:38,119][15372] Avg episode reward: [(0, '43.737')] [2024-08-05 20:23:38,905][15444] Updated weights for policy 0, policy_version 60671 (0.0023) [2024-08-05 20:23:42,184][15444] Updated weights for policy 0, policy_version 60681 (0.0017) [2024-08-05 20:23:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.3, 300 sec: 24159.5). Total num frames: 497115136. Throughput: 0: 6031.8. Samples: 124278980. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 20:23:43,119][15372] Avg episode reward: [(0, '43.257')] [2024-08-05 20:23:45,613][15444] Updated weights for policy 0, policy_version 60691 (0.0018) [2024-08-05 20:23:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.9, 300 sec: 24159.5). 
Total num frames: 497238016. Throughput: 0: 6067.1. Samples: 124316050. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 20:23:48,119][15372] Avg episode reward: [(0, '44.567')] [2024-08-05 20:23:49,006][15444] Updated weights for policy 0, policy_version 60701 (0.0024) [2024-08-05 20:23:52,407][15444] Updated weights for policy 0, policy_version 60711 (0.0012) [2024-08-05 20:23:53,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 497352704. Throughput: 0: 6060.2. Samples: 124334120. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 20:23:53,119][15372] Avg episode reward: [(0, '44.247')] [2024-08-05 20:23:55,811][15444] Updated weights for policy 0, policy_version 60721 (0.0019) [2024-08-05 20:23:58,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 497483776. Throughput: 0: 6049.3. Samples: 124370030. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 20:23:58,128][15372] Avg episode reward: [(0, '43.043')] [2024-08-05 20:23:59,142][15444] Updated weights for policy 0, policy_version 60731 (0.0029) [2024-08-05 20:24:02,685][15444] Updated weights for policy 0, policy_version 60741 (0.0030) [2024-08-05 20:24:03,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 497598464. Throughput: 0: 6022.0. Samples: 124405370. Policy #0 lag: (min: 1.0, avg: 3.8, max: 8.0) [2024-08-05 20:24:03,119][15372] Avg episode reward: [(0, '44.082')] [2024-08-05 20:24:06,108][15444] Updated weights for policy 0, policy_version 60751 (0.0016) [2024-08-05 20:24:07,559][15417] Signal inference workers to stop experience collection... (22250 times) [2024-08-05 20:24:07,567][15417] Signal inference workers to resume experience collection... 
(22250 times) [2024-08-05 20:24:07,597][15444] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-08-05 20:24:07,597][15444] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-08-05 20:24:08,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 497721344. Throughput: 0: 6036.2. Samples: 124423870. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:24:08,119][15372] Avg episode reward: [(0, '45.255')] [2024-08-05 20:24:09,426][15444] Updated weights for policy 0, policy_version 60761 (0.0012) [2024-08-05 20:24:12,722][15444] Updated weights for policy 0, policy_version 60771 (0.0023) [2024-08-05 20:24:13,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 497836032. Throughput: 0: 6039.1. Samples: 124460560. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:24:13,119][15372] Avg episode reward: [(0, '44.497')] [2024-08-05 20:24:16,076][15444] Updated weights for policy 0, policy_version 60781 (0.0011) [2024-08-05 20:24:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.8). Total num frames: 497967104. Throughput: 0: 6050.9. Samples: 124496870. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:24:18,120][15372] Avg episode reward: [(0, '44.472')] [2024-08-05 20:24:18,123][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000060787_497967104.pth... [2024-08-05 20:24:18,313][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000060079_492167168.pth [2024-08-05 20:24:19,746][15444] Updated weights for policy 0, policy_version 60791 (0.0012) [2024-08-05 20:24:22,980][15444] Updated weights for policy 0, policy_version 60801 (0.0012) [2024-08-05 20:24:23,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24159.4). 
Total num frames: 498081792. Throughput: 0: 6028.4. Samples: 124514550. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:24:23,119][15372] Avg episode reward: [(0, '44.580')] [2024-08-05 20:24:26,465][15444] Updated weights for policy 0, policy_version 60811 (0.0012) [2024-08-05 20:24:28,118][15372] Fps is (10 sec: 22937.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 498196480. Throughput: 0: 6034.0. Samples: 124550510. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:24:28,126][15372] Avg episode reward: [(0, '43.966')] [2024-08-05 20:24:29,734][15444] Updated weights for policy 0, policy_version 60821 (0.0019) [2024-08-05 20:24:33,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 498319360. Throughput: 0: 6015.1. Samples: 124586730. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:24:33,127][15372] Avg episode reward: [(0, '42.966')] [2024-08-05 20:24:33,189][15444] Updated weights for policy 0, policy_version 60831 (0.0024) [2024-08-05 20:24:36,759][15444] Updated weights for policy 0, policy_version 60841 (0.0017) [2024-08-05 20:24:38,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 498442240. Throughput: 0: 6019.6. Samples: 124605000. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:24:38,119][15372] Avg episode reward: [(0, '43.649')] [2024-08-05 20:24:39,789][15444] Updated weights for policy 0, policy_version 60851 (0.0019) [2024-08-05 20:24:43,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 498565120. Throughput: 0: 6038.0. Samples: 124641740. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:24:43,126][15372] Avg episode reward: [(0, '44.103')] [2024-08-05 20:24:43,530][15444] Updated weights for policy 0, policy_version 60861 (0.0015) [2024-08-05 20:24:46,482][15444] Updated weights for policy 0, policy_version 60871 (0.0020) [2024-08-05 20:24:48,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 498688000. Throughput: 0: 6059.6. Samples: 124678050. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:24:48,126][15372] Avg episode reward: [(0, '43.956')] [2024-08-05 20:24:48,196][15417] Signal inference workers to stop experience collection... (22300 times) [2024-08-05 20:24:48,202][15417] Signal inference workers to resume experience collection... (22300 times) [2024-08-05 20:24:48,255][15444] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-08-05 20:24:48,258][15444] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-08-05 20:24:50,164][15444] Updated weights for policy 0, policy_version 60881 (0.0041) [2024-08-05 20:24:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 498810880. Throughput: 0: 6046.0. Samples: 124695940. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:24:53,126][15372] Avg episode reward: [(0, '43.018')] [2024-08-05 20:24:53,574][15444] Updated weights for policy 0, policy_version 60891 (0.0021) [2024-08-05 20:24:56,731][15444] Updated weights for policy 0, policy_version 60901 (0.0013) [2024-08-05 20:24:58,119][15372] Fps is (10 sec: 23756.2, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 498925568. Throughput: 0: 6037.1. Samples: 124732230. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:24:58,119][15372] Avg episode reward: [(0, '42.958')] [2024-08-05 20:25:00,248][15444] Updated weights for policy 0, policy_version 60911 (0.0026) [2024-08-05 20:25:03,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 499048448. Throughput: 0: 6020.0. Samples: 124767770. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:25:03,126][15372] Avg episode reward: [(0, '43.199')] [2024-08-05 20:25:03,815][15444] Updated weights for policy 0, policy_version 60921 (0.0014) [2024-08-05 20:25:07,155][15444] Updated weights for policy 0, policy_version 60931 (0.0015) [2024-08-05 20:25:08,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 499163136. Throughput: 0: 6040.9. Samples: 124786390. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:25:08,119][15372] Avg episode reward: [(0, '44.230')] [2024-08-05 20:25:10,427][15444] Updated weights for policy 0, policy_version 60941 (0.0017) [2024-08-05 20:25:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 499286016. Throughput: 0: 6053.8. Samples: 124822930. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:25:13,119][15372] Avg episode reward: [(0, '44.579')] [2024-08-05 20:25:13,859][15444] Updated weights for policy 0, policy_version 60951 (0.0012) [2024-08-05 20:25:17,419][15444] Updated weights for policy 0, policy_version 60961 (0.0011) [2024-08-05 20:25:18,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 499408896. Throughput: 0: 6041.1. Samples: 124858580. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:25:18,119][15372] Avg episode reward: [(0, '44.003')] [2024-08-05 20:25:20,481][15444] Updated weights for policy 0, policy_version 60971 (0.0013) [2024-08-05 20:25:23,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24159.5). 
Total num frames: 499531776. Throughput: 0: 6040.4. Samples: 124876820. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 20:25:23,126][15372] Avg episode reward: [(0, '43.631')] [2024-08-05 20:25:24,205][15444] Updated weights for policy 0, policy_version 60981 (0.0010) [2024-08-05 20:25:27,446][15444] Updated weights for policy 0, policy_version 60991 (0.0030) [2024-08-05 20:25:28,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 499646464. Throughput: 0: 6021.8. Samples: 124912720. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 20:25:28,126][15372] Avg episode reward: [(0, '43.640')] [2024-08-05 20:25:30,883][15444] Updated weights for policy 0, policy_version 61001 (0.0023) [2024-08-05 20:25:31,029][15417] Signal inference workers to stop experience collection... (22350 times) [2024-08-05 20:25:31,036][15417] Signal inference workers to resume experience collection... (22350 times) [2024-08-05 20:25:31,108][15444] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-08-05 20:25:31,114][15444] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-08-05 20:25:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 499769344. Throughput: 0: 6016.9. Samples: 124948810. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 20:25:33,119][15372] Avg episode reward: [(0, '44.318')] [2024-08-05 20:25:34,280][15444] Updated weights for policy 0, policy_version 61011 (0.0016) [2024-08-05 20:25:37,407][15444] Updated weights for policy 0, policy_version 61021 (0.0013) [2024-08-05 20:25:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 499892224. Throughput: 0: 6036.9. Samples: 124967600. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 20:25:38,126][15372] Avg episode reward: [(0, '43.723')] [2024-08-05 20:25:41,125][15444] Updated weights for policy 0, policy_version 61031 (0.0025) [2024-08-05 20:25:43,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.3, 300 sec: 24159.6). Total num frames: 500015104. Throughput: 0: 6017.8. Samples: 125003030. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 20:25:43,119][15372] Avg episode reward: [(0, '44.484')] [2024-08-05 20:25:44,500][15444] Updated weights for policy 0, policy_version 61041 (0.0013) [2024-08-05 20:25:47,783][15444] Updated weights for policy 0, policy_version 61051 (0.0020) [2024-08-05 20:25:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 500137984. Throughput: 0: 6046.4. Samples: 125039860. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 20:25:48,119][15372] Avg episode reward: [(0, '44.440')] [2024-08-05 20:25:51,188][15444] Updated weights for policy 0, policy_version 61061 (0.0010) [2024-08-05 20:25:53,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 500252672. Throughput: 0: 6053.8. Samples: 125058810. Policy #0 lag: (min: 0.0, avg: 4.4, max: 8.0) [2024-08-05 20:25:53,126][15372] Avg episode reward: [(0, '44.487')] [2024-08-05 20:25:54,482][15444] Updated weights for policy 0, policy_version 61071 (0.0022) [2024-08-05 20:25:57,883][15444] Updated weights for policy 0, policy_version 61081 (0.0010) [2024-08-05 20:25:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 500375552. Throughput: 0: 6058.9. Samples: 125095580. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:25:58,119][15372] Avg episode reward: [(0, '44.403')] [2024-08-05 20:26:01,135][15444] Updated weights for policy 0, policy_version 61091 (0.0022) [2024-08-05 20:26:03,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24187.3). 
Total num frames: 500498432. Throughput: 0: 6055.8. Samples: 125131090. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:26:03,127][15372] Avg episode reward: [(0, '44.799')] [2024-08-05 20:26:04,578][15444] Updated weights for policy 0, policy_version 61101 (0.0018) [2024-08-05 20:26:07,889][15444] Updated weights for policy 0, policy_version 61111 (0.0016) [2024-08-05 20:26:08,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.0, 300 sec: 24159.4). Total num frames: 500621312. Throughput: 0: 6063.5. Samples: 125149680. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:26:08,119][15372] Avg episode reward: [(0, '43.301')] [2024-08-05 20:26:11,277][15444] Updated weights for policy 0, policy_version 61121 (0.0015) [2024-08-05 20:26:13,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24303.0, 300 sec: 24187.3). Total num frames: 500744192. Throughput: 0: 6069.3. Samples: 125185840. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:26:13,126][15372] Avg episode reward: [(0, '43.865')] [2024-08-05 20:26:14,654][15444] Updated weights for policy 0, policy_version 61131 (0.0012) [2024-08-05 20:26:17,902][15444] Updated weights for policy 0, policy_version 61141 (0.0029) [2024-08-05 20:26:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 500867072. Throughput: 0: 6087.1. Samples: 125222730. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:26:18,119][15372] Avg episode reward: [(0, '44.061')] [2024-08-05 20:26:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000061141_500867072.pth... 
[2024-08-05 20:26:18,258][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000060432_495058944.pth [2024-08-05 20:26:21,413][15444] Updated weights for policy 0, policy_version 61151 (0.0014) [2024-08-05 20:26:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 500981760. Throughput: 0: 6066.0. Samples: 125240570. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:26:23,127][15372] Avg episode reward: [(0, '43.969')] [2024-08-05 20:26:24,679][15444] Updated weights for policy 0, policy_version 61161 (0.0027) [2024-08-05 20:26:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 501104640. Throughput: 0: 6092.9. Samples: 125277210. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:26:28,126][15372] Avg episode reward: [(0, '43.202')] [2024-08-05 20:26:28,168][15444] Updated weights for policy 0, policy_version 61171 (0.0012) [2024-08-05 20:26:28,941][15417] Signal inference workers to stop experience collection... (22400 times) [2024-08-05 20:26:28,942][15417] Signal inference workers to resume experience collection... (22400 times) [2024-08-05 20:26:28,990][15444] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-08-05 20:26:28,998][15444] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-08-05 20:26:31,876][15444] Updated weights for policy 0, policy_version 61181 (0.0034) [2024-08-05 20:26:33,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 501227520. Throughput: 0: 6064.7. Samples: 125312770. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:26:33,126][15372] Avg episode reward: [(0, '42.318')] [2024-08-05 20:26:35,064][15444] Updated weights for policy 0, policy_version 61191 (0.0015) [2024-08-05 20:26:38,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). 
Total num frames: 501342208. Throughput: 0: 6053.1. Samples: 125331200. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:26:38,126][15372] Avg episode reward: [(0, '43.060')] [2024-08-05 20:26:38,475][15444] Updated weights for policy 0, policy_version 61201 (0.0013) [2024-08-05 20:26:41,704][15444] Updated weights for policy 0, policy_version 61211 (0.0028) [2024-08-05 20:26:43,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 501473280. Throughput: 0: 6042.2. Samples: 125367480. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:26:43,119][15372] Avg episode reward: [(0, '44.134')] [2024-08-05 20:26:45,037][15444] Updated weights for policy 0, policy_version 61221 (0.0011) [2024-08-05 20:26:48,119][15372] Fps is (10 sec: 25394.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 501596160. Throughput: 0: 6070.4. Samples: 125404260. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:26:48,126][15372] Avg episode reward: [(0, '43.338')] [2024-08-05 20:26:48,668][15444] Updated weights for policy 0, policy_version 61231 (0.0014) [2024-08-05 20:26:51,937][15444] Updated weights for policy 0, policy_version 61241 (0.0041) [2024-08-05 20:26:53,120][15372] Fps is (10 sec: 23753.4, 60 sec: 24302.3, 300 sec: 24159.3). Total num frames: 501710848. Throughput: 0: 6070.5. Samples: 125422860. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:26:53,120][15372] Avg episode reward: [(0, '43.828')] [2024-08-05 20:26:55,096][15444] Updated weights for policy 0, policy_version 61251 (0.0018) [2024-08-05 20:26:58,118][15372] Fps is (10 sec: 23757.2, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 501833728. Throughput: 0: 6074.2. Samples: 125459180. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:26:58,126][15372] Avg episode reward: [(0, '44.221')] [2024-08-05 20:26:58,593][15444] Updated weights for policy 0, policy_version 61261 (0.0014) [2024-08-05 20:27:01,874][15444] Updated weights for policy 0, policy_version 61271 (0.0016) [2024-08-05 20:27:03,118][15372] Fps is (10 sec: 24579.8, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 501956608. Throughput: 0: 6053.3. Samples: 125495130. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:27:03,119][15372] Avg episode reward: [(0, '44.126')] [2024-08-05 20:27:05,608][15444] Updated weights for policy 0, policy_version 61281 (0.0014) [2024-08-05 20:27:06,852][15417] Signal inference workers to stop experience collection... (22450 times) [2024-08-05 20:27:06,852][15417] Signal inference workers to resume experience collection... (22450 times) [2024-08-05 20:27:06,890][15444] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-08-05 20:27:06,890][15444] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-08-05 20:27:08,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 502079488. Throughput: 0: 6071.8. Samples: 125513800. Policy #0 lag: (min: 1.0, avg: 3.5, max: 8.0) [2024-08-05 20:27:08,119][15372] Avg episode reward: [(0, '45.110')] [2024-08-05 20:27:08,754][15444] Updated weights for policy 0, policy_version 61291 (0.0037) [2024-08-05 20:27:12,042][15444] Updated weights for policy 0, policy_version 61301 (0.0015) [2024-08-05 20:27:13,119][15372] Fps is (10 sec: 24574.6, 60 sec: 24302.7, 300 sec: 24214.9). Total num frames: 502202368. Throughput: 0: 6090.1. Samples: 125551270. 
Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:27:13,119][15372] Avg episode reward: [(0, '42.828')] [2024-08-05 20:27:15,233][15444] Updated weights for policy 0, policy_version 61311 (0.0017) [2024-08-05 20:27:18,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.3, 300 sec: 24215.0). Total num frames: 502317056. Throughput: 0: 6108.8. Samples: 125587670. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:27:18,119][15372] Avg episode reward: [(0, '42.977')] [2024-08-05 20:27:18,605][15444] Updated weights for policy 0, policy_version 61321 (0.0014) [2024-08-05 20:27:22,342][15444] Updated weights for policy 0, policy_version 61331 (0.0027) [2024-08-05 20:27:23,118][15372] Fps is (10 sec: 24577.4, 60 sec: 24439.5, 300 sec: 24215.0). Total num frames: 502448128. Throughput: 0: 6107.8. Samples: 125606050. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:27:23,119][15372] Avg episode reward: [(0, '44.425')] [2024-08-05 20:27:25,460][15444] Updated weights for policy 0, policy_version 61341 (0.0022) [2024-08-05 20:27:28,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 502562816. Throughput: 0: 6090.4. Samples: 125641550. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:27:28,126][15372] Avg episode reward: [(0, '44.089')] [2024-08-05 20:27:29,113][15444] Updated weights for policy 0, policy_version 61351 (0.0024) [2024-08-05 20:27:32,429][15444] Updated weights for policy 0, policy_version 61361 (0.0013) [2024-08-05 20:27:33,119][15372] Fps is (10 sec: 22937.1, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 502677504. Throughput: 0: 6062.4. Samples: 125677070. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:27:33,127][15372] Avg episode reward: [(0, '43.302')] [2024-08-05 20:27:35,634][15444] Updated weights for policy 0, policy_version 61371 (0.0013) [2024-08-05 20:27:38,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24439.5, 300 sec: 24242.8). 
Total num frames: 502808576. Throughput: 0: 6045.8. Samples: 125694910. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:27:38,126][15372] Avg episode reward: [(0, '43.352')] [2024-08-05 20:27:39,377][15444] Updated weights for policy 0, policy_version 61381 (0.0018) [2024-08-05 20:27:42,574][15444] Updated weights for policy 0, policy_version 61391 (0.0010) [2024-08-05 20:27:43,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 502923264. Throughput: 0: 6051.3. Samples: 125731490. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:27:43,119][15372] Avg episode reward: [(0, '44.872')] [2024-08-05 20:27:45,908][15444] Updated weights for policy 0, policy_version 61401 (0.0013) [2024-08-05 20:27:48,123][15372] Fps is (10 sec: 23745.6, 60 sec: 24164.5, 300 sec: 24186.8). Total num frames: 503046144. Throughput: 0: 6050.9. Samples: 125767450. Policy #0 lag: (min: 1.0, avg: 3.5, max: 7.0) [2024-08-05 20:27:48,131][15372] Avg episode reward: [(0, '44.416')] [2024-08-05 20:27:49,551][15444] Updated weights for policy 0, policy_version 61411 (0.0016) [2024-08-05 20:27:52,124][15417] Signal inference workers to stop experience collection... (22500 times) [2024-08-05 20:27:52,136][15417] Signal inference workers to resume experience collection... (22500 times) [2024-08-05 20:27:52,154][15444] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-08-05 20:27:52,160][15444] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-08-05 20:27:52,672][15444] Updated weights for policy 0, policy_version 61421 (0.0021) [2024-08-05 20:27:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24167.0, 300 sec: 24187.2). Total num frames: 503160832. Throughput: 0: 6028.0. Samples: 125785060. 
Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 20:27:53,119][15372] Avg episode reward: [(0, '43.614')] [2024-08-05 20:27:56,251][15444] Updated weights for policy 0, policy_version 61431 (0.0013) [2024-08-05 20:27:58,118][15372] Fps is (10 sec: 23768.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 503283712. Throughput: 0: 5997.4. Samples: 125821150. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 20:27:58,126][15372] Avg episode reward: [(0, '43.300')] [2024-08-05 20:27:59,660][15444] Updated weights for policy 0, policy_version 61441 (0.0024) [2024-08-05 20:28:02,901][15444] Updated weights for policy 0, policy_version 61451 (0.0018) [2024-08-05 20:28:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 503406592. Throughput: 0: 6011.6. Samples: 125858190. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 20:28:03,119][15372] Avg episode reward: [(0, '43.042')] [2024-08-05 20:28:06,347][15444] Updated weights for policy 0, policy_version 61461 (0.0022) [2024-08-05 20:28:08,121][15372] Fps is (10 sec: 24569.9, 60 sec: 24165.4, 300 sec: 24214.8). Total num frames: 503529472. Throughput: 0: 6007.4. Samples: 125876400. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 20:28:08,129][15372] Avg episode reward: [(0, '43.173')] [2024-08-05 20:28:09,834][15444] Updated weights for policy 0, policy_version 61471 (0.0032) [2024-08-05 20:28:13,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24029.9, 300 sec: 24159.4). Total num frames: 503644160. Throughput: 0: 6037.1. Samples: 125913220. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 20:28:13,127][15372] Avg episode reward: [(0, '43.835')] [2024-08-05 20:28:13,175][15444] Updated weights for policy 0, policy_version 61481 (0.0021) [2024-08-05 20:28:16,342][15444] Updated weights for policy 0, policy_version 61491 (0.0014) [2024-08-05 20:28:18,119][15372] Fps is (10 sec: 23762.0, 60 sec: 24166.3, 300 sec: 24187.2). 
Total num frames: 503767040. Throughput: 0: 6031.3. Samples: 125948480. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 20:28:18,127][15372] Avg episode reward: [(0, '44.921')] [2024-08-05 20:28:18,201][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000061496_503775232.pth... [2024-08-05 20:28:18,342][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000060787_497967104.pth [2024-08-05 20:28:20,040][15444] Updated weights for policy 0, policy_version 61501 (0.0018) [2024-08-05 20:28:23,063][15444] Updated weights for policy 0, policy_version 61511 (0.0018) [2024-08-05 20:28:23,119][15372] Fps is (10 sec: 25396.2, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 503898112. Throughput: 0: 6046.0. Samples: 125966980. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0) [2024-08-05 20:28:23,119][15372] Avg episode reward: [(0, '44.093')] [2024-08-05 20:28:26,570][15444] Updated weights for policy 0, policy_version 61521 (0.0011) [2024-08-05 20:28:28,118][15372] Fps is (10 sec: 24576.8, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 504012800. Throughput: 0: 6042.7. Samples: 126003410. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:28:28,126][15372] Avg episode reward: [(0, '44.135')] [2024-08-05 20:28:29,913][15444] Updated weights for policy 0, policy_version 61531 (0.0020) [2024-08-05 20:28:33,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24303.0, 300 sec: 24215.0). Total num frames: 504135680. Throughput: 0: 6058.6. Samples: 126040060. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:28:33,126][15372] Avg episode reward: [(0, '44.280')] [2024-08-05 20:28:33,267][15444] Updated weights for policy 0, policy_version 61541 (0.0019) [2024-08-05 20:28:36,706][15444] Updated weights for policy 0, policy_version 61551 (0.0011) [2024-08-05 20:28:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 504258560. Throughput: 0: 6072.2. Samples: 126058310. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:28:38,119][15372] Avg episode reward: [(0, '44.019')] [2024-08-05 20:28:40,032][15444] Updated weights for policy 0, policy_version 61561 (0.0016) [2024-08-05 20:28:41,122][15417] Signal inference workers to stop experience collection... (22550 times) [2024-08-05 20:28:41,123][15417] Signal inference workers to resume experience collection... (22550 times) [2024-08-05 20:28:41,170][15444] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-08-05 20:28:41,170][15444] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-08-05 20:28:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 504381440. Throughput: 0: 6089.6. Samples: 126095180. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:28:43,119][15372] Avg episode reward: [(0, '43.791')] [2024-08-05 20:28:43,292][15444] Updated weights for policy 0, policy_version 61571 (0.0012) [2024-08-05 20:28:46,911][15444] Updated weights for policy 0, policy_version 61581 (0.0018) [2024-08-05 20:28:48,121][15372] Fps is (10 sec: 24568.9, 60 sec: 24303.7, 300 sec: 24242.5). Total num frames: 504504320. Throughput: 0: 6066.3. Samples: 126131190. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:28:48,122][15372] Avg episode reward: [(0, '43.953')] [2024-08-05 20:28:50,140][15444] Updated weights for policy 0, policy_version 61591 (0.0030) [2024-08-05 20:28:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 504619008. Throughput: 0: 6083.0. Samples: 126150120. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:28:53,126][15372] Avg episode reward: [(0, '44.482')] [2024-08-05 20:28:53,526][15444] Updated weights for policy 0, policy_version 61601 (0.0021) [2024-08-05 20:28:56,967][15444] Updated weights for policy 0, policy_version 61611 (0.0013) [2024-08-05 20:28:58,119][15372] Fps is (10 sec: 23763.4, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 504741888. Throughput: 0: 6049.6. Samples: 126185450. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:28:58,119][15372] Avg episode reward: [(0, '45.565')] [2024-08-05 20:29:00,340][15444] Updated weights for policy 0, policy_version 61621 (0.0011) [2024-08-05 20:29:03,119][15372] Fps is (10 sec: 23756.0, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 504856576. Throughput: 0: 6072.7. Samples: 126221750. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:29:03,119][15372] Avg episode reward: [(0, '45.359')] [2024-08-05 20:29:03,864][15444] Updated weights for policy 0, policy_version 61631 (0.0012) [2024-08-05 20:29:07,279][15444] Updated weights for policy 0, policy_version 61641 (0.0023) [2024-08-05 20:29:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24167.4, 300 sec: 24215.0). Total num frames: 504979456. Throughput: 0: 6059.8. Samples: 126239670. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:29:08,119][15372] Avg episode reward: [(0, '44.307')] [2024-08-05 20:29:10,561][15444] Updated weights for policy 0, policy_version 61651 (0.0010) [2024-08-05 20:29:13,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24303.1, 300 sec: 24187.2). 
Total num frames: 505102336. Throughput: 0: 6037.3. Samples: 126275090. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:29:13,126][15372] Avg episode reward: [(0, '44.506')] [2024-08-05 20:29:14,179][15444] Updated weights for policy 0, policy_version 61661 (0.0011) [2024-08-05 20:29:17,365][15444] Updated weights for policy 0, policy_version 61671 (0.0018) [2024-08-05 20:29:18,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.5, 300 sec: 24187.2). Total num frames: 505217024. Throughput: 0: 6022.2. Samples: 126311060. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:29:18,135][15372] Avg episode reward: [(0, '44.516')] [2024-08-05 20:29:20,932][15444] Updated weights for policy 0, policy_version 61681 (0.0018) [2024-08-05 20:29:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24215.0). Total num frames: 505339904. Throughput: 0: 6029.6. Samples: 126329640. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:29:23,119][15372] Avg episode reward: [(0, '44.784')] [2024-08-05 20:29:24,379][15444] Updated weights for policy 0, policy_version 61691 (0.0024) [2024-08-05 20:29:27,512][15444] Updated weights for policy 0, policy_version 61701 (0.0017) [2024-08-05 20:29:28,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 505462784. Throughput: 0: 6010.4. Samples: 126365650. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:29:28,119][15372] Avg episode reward: [(0, '44.219')] [2024-08-05 20:29:31,085][15444] Updated weights for policy 0, policy_version 61711 (0.0021) [2024-08-05 20:29:33,122][15372] Fps is (10 sec: 23747.7, 60 sec: 24028.3, 300 sec: 24186.9). Total num frames: 505577472. Throughput: 0: 6019.4. Samples: 126402070. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:29:33,123][15372] Avg episode reward: [(0, '44.495')] [2024-08-05 20:29:34,532][15444] Updated weights for policy 0, policy_version 61721 (0.0027) [2024-08-05 20:29:35,650][15417] Signal inference workers to stop experience collection... (22600 times) [2024-08-05 20:29:35,650][15417] Signal inference workers to resume experience collection... (22600 times) [2024-08-05 20:29:35,719][15444] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-08-05 20:29:35,719][15444] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-08-05 20:29:37,763][15444] Updated weights for policy 0, policy_version 61731 (0.0025) [2024-08-05 20:29:38,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 505700352. Throughput: 0: 5988.0. Samples: 126419580. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:29:38,119][15372] Avg episode reward: [(0, '44.388')] [2024-08-05 20:29:41,197][15444] Updated weights for policy 0, policy_version 61741 (0.0045) [2024-08-05 20:29:43,119][15372] Fps is (10 sec: 24583.9, 60 sec: 24029.6, 300 sec: 24187.2). Total num frames: 505823232. Throughput: 0: 6005.7. Samples: 126455710. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:29:43,127][15372] Avg episode reward: [(0, '43.927')] [2024-08-05 20:29:44,677][15444] Updated weights for policy 0, policy_version 61751 (0.0030) [2024-08-05 20:29:48,112][15444] Updated weights for policy 0, policy_version 61761 (0.0019) [2024-08-05 20:29:48,120][15372] Fps is (10 sec: 24571.5, 60 sec: 24030.3, 300 sec: 24187.1). Total num frames: 505946112. Throughput: 0: 6002.5. Samples: 126491870. 
Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 20:29:48,121][15372] Avg episode reward: [(0, '43.959')] [2024-08-05 20:29:51,532][15444] Updated weights for policy 0, policy_version 61771 (0.0023) [2024-08-05 20:29:53,118][15372] Fps is (10 sec: 23758.3, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 506060800. Throughput: 0: 6021.3. Samples: 126510630. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 20:29:53,126][15372] Avg episode reward: [(0, '43.411')] [2024-08-05 20:29:54,848][15444] Updated weights for policy 0, policy_version 61781 (0.0032) [2024-08-05 20:29:58,118][15372] Fps is (10 sec: 23761.1, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 506183680. Throughput: 0: 6042.0. Samples: 126546980. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 20:29:58,126][15372] Avg episode reward: [(0, '42.617')] [2024-08-05 20:29:58,267][15444] Updated weights for policy 0, policy_version 61791 (0.0018) [2024-08-05 20:30:01,559][15444] Updated weights for policy 0, policy_version 61801 (0.0016) [2024-08-05 20:30:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24215.0). Total num frames: 506306560. Throughput: 0: 6033.8. Samples: 126582580. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 20:30:03,126][15372] Avg episode reward: [(0, '42.505')] [2024-08-05 20:30:04,973][15444] Updated weights for policy 0, policy_version 61811 (0.0021) [2024-08-05 20:30:08,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 506429440. Throughput: 0: 6039.1. Samples: 126601400. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 20:30:08,126][15372] Avg episode reward: [(0, '44.168')] [2024-08-05 20:30:08,495][15444] Updated weights for policy 0, policy_version 61821 (0.0013) [2024-08-05 20:30:11,773][15444] Updated weights for policy 0, policy_version 61831 (0.0013) [2024-08-05 20:30:13,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.4, 300 sec: 24215.0). 
Total num frames: 506552320. Throughput: 0: 6046.7. Samples: 126637750. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 20:30:13,119][15372] Avg episode reward: [(0, '45.670')] [2024-08-05 20:30:14,942][15444] Updated weights for policy 0, policy_version 61841 (0.0017) [2024-08-05 20:30:18,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 506667008. Throughput: 0: 6048.5. Samples: 126674230. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0) [2024-08-05 20:30:18,126][15372] Avg episode reward: [(0, '44.774')] [2024-08-05 20:30:18,168][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000061850_506675200.pth... [2024-08-05 20:30:18,269][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000061141_500867072.pth [2024-08-05 20:30:18,619][15444] Updated weights for policy 0, policy_version 61851 (0.0020) [2024-08-05 20:30:19,263][15417] Signal inference workers to stop experience collection... (22650 times) [2024-08-05 20:30:19,270][15417] Signal inference workers to resume experience collection... (22650 times) [2024-08-05 20:30:19,329][15444] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-08-05 20:30:19,330][15444] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-08-05 20:30:21,877][15444] Updated weights for policy 0, policy_version 61861 (0.0021) [2024-08-05 20:30:23,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 506789888. Throughput: 0: 6062.0. Samples: 126692370. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:30:23,126][15372] Avg episode reward: [(0, '44.916')] [2024-08-05 20:30:25,283][15444] Updated weights for policy 0, policy_version 61871 (0.0028) [2024-08-05 20:30:28,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24215.0). 
Total num frames: 506912768. Throughput: 0: 6054.5. Samples: 126728160. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:30:28,126][15372] Avg episode reward: [(0, '45.140')] [2024-08-05 20:30:28,779][15444] Updated weights for policy 0, policy_version 61881 (0.0022) [2024-08-05 20:30:32,037][15444] Updated weights for policy 0, policy_version 61891 (0.0016) [2024-08-05 20:30:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24167.9, 300 sec: 24187.2). Total num frames: 507027456. Throughput: 0: 6048.2. Samples: 126764030. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:30:33,119][15372] Avg episode reward: [(0, '44.569')] [2024-08-05 20:30:35,634][15444] Updated weights for policy 0, policy_version 61901 (0.0022) [2024-08-05 20:30:38,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 507150336. Throughput: 0: 6046.9. Samples: 126782740. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:30:38,119][15372] Avg episode reward: [(0, '44.647')] [2024-08-05 20:30:38,946][15444] Updated weights for policy 0, policy_version 61911 (0.0016) [2024-08-05 20:30:42,197][15444] Updated weights for policy 0, policy_version 61921 (0.0011) [2024-08-05 20:30:43,118][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 507273216. Throughput: 0: 6050.9. Samples: 126819270. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:30:43,126][15372] Avg episode reward: [(0, '44.739')] [2024-08-05 20:30:45,574][15444] Updated weights for policy 0, policy_version 61931 (0.0028) [2024-08-05 20:30:48,129][15372] Fps is (10 sec: 24549.3, 60 sec: 24162.7, 300 sec: 24214.1). Total num frames: 507396096. Throughput: 0: 6072.1. Samples: 126855890. 
Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:30:48,130][15372] Avg episode reward: [(0, '44.910')] [2024-08-05 20:30:48,831][15444] Updated weights for policy 0, policy_version 61941 (0.0016) [2024-08-05 20:30:52,382][15444] Updated weights for policy 0, policy_version 61951 (0.0027) [2024-08-05 20:30:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 507518976. Throughput: 0: 6062.9. Samples: 126874230. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:30:53,119][15372] Avg episode reward: [(0, '45.103')] [2024-08-05 20:30:55,778][15444] Updated weights for policy 0, policy_version 61961 (0.0021) [2024-08-05 20:30:58,119][15372] Fps is (10 sec: 24602.6, 60 sec: 24302.9, 300 sec: 24215.0). Total num frames: 507641856. Throughput: 0: 6040.2. Samples: 126909560. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:30:58,126][15372] Avg episode reward: [(0, '44.146')] [2024-08-05 20:30:59,244][15444] Updated weights for policy 0, policy_version 61971 (0.0017) [2024-08-05 20:31:02,712][15444] Updated weights for policy 0, policy_version 61981 (0.0015) [2024-08-05 20:31:03,118][15372] Fps is (10 sec: 22937.6, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 507748352. Throughput: 0: 6030.2. Samples: 126945590. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:03,119][15372] Avg episode reward: [(0, '44.958')] [2024-08-05 20:31:05,905][15444] Updated weights for policy 0, policy_version 61991 (0.0042) [2024-08-05 20:31:08,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 507879424. Throughput: 0: 6020.9. Samples: 126963310. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:08,126][15372] Avg episode reward: [(0, '44.177')] [2024-08-05 20:31:09,714][15444] Updated weights for policy 0, policy_version 62001 (0.0021) [2024-08-05 20:31:10,093][15417] Signal inference workers to stop experience collection... 
(22700 times) [2024-08-05 20:31:10,094][15417] Signal inference workers to resume experience collection... (22700 times) [2024-08-05 20:31:10,141][15444] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-08-05 20:31:10,141][15444] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-08-05 20:31:12,871][15444] Updated weights for policy 0, policy_version 62011 (0.0012) [2024-08-05 20:31:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 507994112. Throughput: 0: 6021.1. Samples: 126999110. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:13,119][15372] Avg episode reward: [(0, '44.310')] [2024-08-05 20:31:16,457][15444] Updated weights for policy 0, policy_version 62021 (0.0021) [2024-08-05 20:31:18,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24029.9, 300 sec: 24159.4). Total num frames: 508108800. Throughput: 0: 6010.0. Samples: 127034480. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:18,119][15372] Avg episode reward: [(0, '44.652')] [2024-08-05 20:31:19,934][15444] Updated weights for policy 0, policy_version 62031 (0.0026) [2024-08-05 20:31:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 508231680. Throughput: 0: 6007.3. Samples: 127053070. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:23,126][15372] Avg episode reward: [(0, '43.781')] [2024-08-05 20:31:23,186][15444] Updated weights for policy 0, policy_version 62041 (0.0025) [2024-08-05 20:31:26,615][15444] Updated weights for policy 0, policy_version 62051 (0.0021) [2024-08-05 20:31:28,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24159.4). Total num frames: 508354560. Throughput: 0: 5993.3. Samples: 127088970. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:28,126][15372] Avg episode reward: [(0, '43.433')] [2024-08-05 20:31:29,898][15444] Updated weights for policy 0, policy_version 62061 (0.0029) [2024-08-05 20:31:33,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 508477440. Throughput: 0: 5997.2. Samples: 127125700. Policy #0 lag: (min: 0.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:33,126][15372] Avg episode reward: [(0, '43.633')] [2024-08-05 20:31:33,353][15444] Updated weights for policy 0, policy_version 62071 (0.0017) [2024-08-05 20:31:36,722][15444] Updated weights for policy 0, policy_version 62081 (0.0014) [2024-08-05 20:31:38,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 508600320. Throughput: 0: 5991.1. Samples: 127143830. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:38,119][15372] Avg episode reward: [(0, '44.491')] [2024-08-05 20:31:39,994][15444] Updated weights for policy 0, policy_version 62091 (0.0013) [2024-08-05 20:31:43,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 508715008. Throughput: 0: 6023.1. Samples: 127180600. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:43,126][15372] Avg episode reward: [(0, '43.214')] [2024-08-05 20:31:43,684][15444] Updated weights for policy 0, policy_version 62101 (0.0020) [2024-08-05 20:31:46,634][15444] Updated weights for policy 0, policy_version 62111 (0.0027) [2024-08-05 20:31:48,118][15372] Fps is (10 sec: 23757.0, 60 sec: 24034.2, 300 sec: 24159.6). Total num frames: 508837888. Throughput: 0: 6014.0. Samples: 127216220. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:48,128][15372] Avg episode reward: [(0, '43.814')] [2024-08-05 20:31:50,325][15444] Updated weights for policy 0, policy_version 62121 (0.0012) [2024-08-05 20:31:50,671][15417] Signal inference workers to stop experience collection... 
(22750 times) [2024-08-05 20:31:50,672][15417] Signal inference workers to resume experience collection... (22750 times) [2024-08-05 20:31:50,746][15444] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-08-05 20:31:50,746][15444] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-08-05 20:31:53,119][15372] Fps is (10 sec: 25394.1, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 508968960. Throughput: 0: 6030.2. Samples: 127234670. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:53,119][15372] Avg episode reward: [(0, '43.100')] [2024-08-05 20:31:53,709][15444] Updated weights for policy 0, policy_version 62131 (0.0012) [2024-08-05 20:31:56,747][15444] Updated weights for policy 0, policy_version 62141 (0.0021) [2024-08-05 20:31:58,119][15372] Fps is (10 sec: 24575.4, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 509083648. Throughput: 0: 6048.6. Samples: 127271300. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 20:31:58,119][15372] Avg episode reward: [(0, '43.343')] [2024-08-05 20:32:00,291][15444] Updated weights for policy 0, policy_version 62151 (0.0016) [2024-08-05 20:32:03,118][15372] Fps is (10 sec: 23757.5, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 509206528. Throughput: 0: 6088.7. Samples: 127308470. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 20:32:03,119][15372] Avg episode reward: [(0, '44.885')] [2024-08-05 20:32:03,463][15444] Updated weights for policy 0, policy_version 62161 (0.0027) [2024-08-05 20:32:07,183][15444] Updated weights for policy 0, policy_version 62171 (0.0022) [2024-08-05 20:32:08,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 509329408. Throughput: 0: 6067.7. Samples: 127326120. 
Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 20:32:08,119][15372] Avg episode reward: [(0, '44.411')] [2024-08-05 20:32:10,600][15444] Updated weights for policy 0, policy_version 62181 (0.0036) [2024-08-05 20:32:13,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 509452288. Throughput: 0: 6085.8. Samples: 127362830. Policy #0 lag: (min: 2.0, avg: 4.4, max: 9.0) [2024-08-05 20:32:13,119][15372] Avg episode reward: [(0, '43.368')] [2024-08-05 20:32:13,664][15444] Updated weights for policy 0, policy_version 62191 (0.0018) [2024-08-05 20:32:17,317][15444] Updated weights for policy 0, policy_version 62201 (0.0034) [2024-08-05 20:32:18,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24439.5, 300 sec: 24159.5). Total num frames: 509575168. Throughput: 0: 6069.1. Samples: 127398810. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 20:32:18,119][15372] Avg episode reward: [(0, '42.497')] [2024-08-05 20:32:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000062204_509575168.pth... [2024-08-05 20:32:18,261][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000061496_503775232.pth [2024-08-05 20:32:20,331][15444] Updated weights for policy 0, policy_version 62211 (0.0011) [2024-08-05 20:32:23,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24302.8, 300 sec: 24159.4). Total num frames: 509689856. Throughput: 0: 6080.6. Samples: 127417460. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 20:32:23,119][15372] Avg episode reward: [(0, '43.427')] [2024-08-05 20:32:23,945][15444] Updated weights for policy 0, policy_version 62221 (0.0018) [2024-08-05 20:32:27,281][15444] Updated weights for policy 0, policy_version 62231 (0.0030) [2024-08-05 20:32:28,119][15372] Fps is (10 sec: 22936.6, 60 sec: 24166.2, 300 sec: 24159.4). Total num frames: 509804544. 
Throughput: 0: 6069.5. Samples: 127453730. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 20:32:28,128][15372] Avg episode reward: [(0, '43.744')] [2024-08-05 20:32:30,244][15417] Signal inference workers to stop experience collection... (22800 times) [2024-08-05 20:32:30,251][15417] Signal inference workers to resume experience collection... (22800 times) [2024-08-05 20:32:30,323][15444] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-08-05 20:32:30,323][15444] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-08-05 20:32:30,690][15444] Updated weights for policy 0, policy_version 62241 (0.0014) [2024-08-05 20:32:33,118][15372] Fps is (10 sec: 24576.7, 60 sec: 24303.0, 300 sec: 24159.5). Total num frames: 509935616. Throughput: 0: 6078.9. Samples: 127489770. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 20:32:33,119][15372] Avg episode reward: [(0, '43.554')] [2024-08-05 20:32:34,241][15444] Updated weights for policy 0, policy_version 62251 (0.0029) [2024-08-05 20:32:37,498][15444] Updated weights for policy 0, policy_version 62261 (0.0017) [2024-08-05 20:32:38,118][15372] Fps is (10 sec: 24577.2, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 510050304. Throughput: 0: 6066.1. Samples: 127507640. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 20:32:38,119][15372] Avg episode reward: [(0, '43.757')] [2024-08-05 20:32:40,829][15444] Updated weights for policy 0, policy_version 62271 (0.0013) [2024-08-05 20:32:43,119][15372] Fps is (10 sec: 23755.0, 60 sec: 24302.6, 300 sec: 24159.8). Total num frames: 510173184. Throughput: 0: 6040.6. Samples: 127543130. 
Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 20:32:43,127][15372] Avg episode reward: [(0, '43.240')] [2024-08-05 20:32:44,365][15444] Updated weights for policy 0, policy_version 62281 (0.0012) [2024-08-05 20:32:47,740][15444] Updated weights for policy 0, policy_version 62291 (0.0023) [2024-08-05 20:32:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 510287872. Throughput: 0: 6020.2. Samples: 127579380. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 20:32:48,119][15372] Avg episode reward: [(0, '44.139')] [2024-08-05 20:32:51,065][15444] Updated weights for policy 0, policy_version 62301 (0.0014) [2024-08-05 20:32:53,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24166.1, 300 sec: 24187.1). Total num frames: 510418944. Throughput: 0: 6041.7. Samples: 127598000. Policy #0 lag: (min: 1.0, avg: 4.3, max: 8.0) [2024-08-05 20:32:53,120][15372] Avg episode reward: [(0, '45.143')] [2024-08-05 20:32:54,894][15444] Updated weights for policy 0, policy_version 62311 (0.0018) [2024-08-05 20:32:58,119][15372] Fps is (10 sec: 23755.8, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 510525440. Throughput: 0: 6010.2. Samples: 127633290. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:32:58,127][15372] Avg episode reward: [(0, '43.819')] [2024-08-05 20:32:58,169][15444] Updated weights for policy 0, policy_version 62321 (0.0037) [2024-08-05 20:33:01,471][15444] Updated weights for policy 0, policy_version 62331 (0.0017) [2024-08-05 20:33:03,118][15372] Fps is (10 sec: 22939.9, 60 sec: 24029.9, 300 sec: 24131.9). Total num frames: 510648320. Throughput: 0: 6009.3. Samples: 127669230. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:03,126][15372] Avg episode reward: [(0, '43.806')] [2024-08-05 20:33:04,983][15444] Updated weights for policy 0, policy_version 62341 (0.0011) [2024-08-05 20:33:08,034][15444] Updated weights for policy 0, policy_version 62351 (0.0019) [2024-08-05 20:33:08,118][15372] Fps is (10 sec: 25396.1, 60 sec: 24166.5, 300 sec: 24187.3). Total num frames: 510779392. Throughput: 0: 6008.9. Samples: 127687860. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:08,119][15372] Avg episode reward: [(0, '44.939')] [2024-08-05 20:33:11,655][15444] Updated weights for policy 0, policy_version 62361 (0.0013) [2024-08-05 20:33:13,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 510894080. Throughput: 0: 6007.4. Samples: 127724060. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:13,119][15372] Avg episode reward: [(0, '45.251')] [2024-08-05 20:33:14,864][15444] Updated weights for policy 0, policy_version 62371 (0.0027) [2024-08-05 20:33:18,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 511016960. Throughput: 0: 6017.6. Samples: 127760560. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:18,126][15372] Avg episode reward: [(0, '44.020')] [2024-08-05 20:33:18,317][15444] Updated weights for policy 0, policy_version 62381 (0.0020) [2024-08-05 20:33:22,061][15444] Updated weights for policy 0, policy_version 62391 (0.0013) [2024-08-05 20:33:22,874][15417] Signal inference workers to stop experience collection... (22850 times) [2024-08-05 20:33:22,875][15417] Signal inference workers to resume experience collection... 
(22850 times) [2024-08-05 20:33:22,920][15444] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-08-05 20:33:22,920][15444] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-08-05 20:33:23,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 511139840. Throughput: 0: 6018.4. Samples: 127778470. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:23,119][15372] Avg episode reward: [(0, '43.545')] [2024-08-05 20:33:25,023][15444] Updated weights for policy 0, policy_version 62401 (0.0029) [2024-08-05 20:33:28,119][15372] Fps is (10 sec: 23754.7, 60 sec: 24166.2, 300 sec: 24131.6). Total num frames: 511254528. Throughput: 0: 6037.5. Samples: 127814820. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:28,127][15372] Avg episode reward: [(0, '43.472')] [2024-08-05 20:33:28,695][15444] Updated weights for policy 0, policy_version 62411 (0.0020) [2024-08-05 20:33:31,928][15444] Updated weights for policy 0, policy_version 62421 (0.0014) [2024-08-05 20:33:33,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 511377408. Throughput: 0: 6032.7. Samples: 127850850. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:33,119][15372] Avg episode reward: [(0, '44.575')] [2024-08-05 20:33:35,349][15444] Updated weights for policy 0, policy_version 62431 (0.0012) [2024-08-05 20:33:38,118][15372] Fps is (10 sec: 24578.1, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 511500288. Throughput: 0: 6011.5. Samples: 127868510. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:38,119][15372] Avg episode reward: [(0, '44.962')] [2024-08-05 20:33:38,930][15444] Updated weights for policy 0, policy_version 62441 (0.0010) [2024-08-05 20:33:42,013][15444] Updated weights for policy 0, policy_version 62451 (0.0012) [2024-08-05 20:33:43,119][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.1, 300 sec: 24104.2). 
Total num frames: 511614976. Throughput: 0: 6034.7. Samples: 127904850. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:43,126][15372] Avg episode reward: [(0, '44.011')] [2024-08-05 20:33:45,452][15444] Updated weights for policy 0, policy_version 62461 (0.0037) [2024-08-05 20:33:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 511737856. Throughput: 0: 6051.3. Samples: 127941540. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:48,119][15372] Avg episode reward: [(0, '43.150')] [2024-08-05 20:33:48,827][15444] Updated weights for policy 0, policy_version 62471 (0.0020) [2024-08-05 20:33:52,289][15444] Updated weights for policy 0, policy_version 62481 (0.0011) [2024-08-05 20:33:53,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24030.2, 300 sec: 24131.7). Total num frames: 511860736. Throughput: 0: 6041.3. Samples: 127959720. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:53,119][15372] Avg episode reward: [(0, '44.563')] [2024-08-05 20:33:55,533][15444] Updated weights for policy 0, policy_version 62491 (0.0026) [2024-08-05 20:33:58,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24303.1, 300 sec: 24159.5). Total num frames: 511983616. Throughput: 0: 6036.5. Samples: 127995700. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:33:58,126][15372] Avg episode reward: [(0, '44.474')] [2024-08-05 20:33:59,152][15444] Updated weights for policy 0, policy_version 62501 (0.0018) [2024-08-05 20:34:02,332][15444] Updated weights for policy 0, policy_version 62511 (0.0012) [2024-08-05 20:34:03,118][15372] Fps is (10 sec: 23757.1, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 512098304. Throughput: 0: 6022.4. Samples: 128031570. 
Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:34:03,119][15372] Avg episode reward: [(0, '44.972')] [2024-08-05 20:34:06,068][15444] Updated weights for policy 0, policy_version 62521 (0.0032) [2024-08-05 20:34:08,119][15372] Fps is (10 sec: 22937.2, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 512212992. Throughput: 0: 6031.7. Samples: 128049900. Policy #0 lag: (min: 1.0, avg: 3.7, max: 8.0) [2024-08-05 20:34:08,119][15372] Avg episode reward: [(0, '44.598')] [2024-08-05 20:34:09,300][15444] Updated weights for policy 0, policy_version 62531 (0.0013) [2024-08-05 20:34:12,711][15444] Updated weights for policy 0, policy_version 62541 (0.0011) [2024-08-05 20:34:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 512335872. Throughput: 0: 6022.6. Samples: 128085830. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:34:13,119][15372] Avg episode reward: [(0, '44.641')] [2024-08-05 20:34:16,170][15444] Updated weights for policy 0, policy_version 62551 (0.0021) [2024-08-05 20:34:18,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 512458752. Throughput: 0: 6016.2. Samples: 128121580. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:34:18,126][15372] Avg episode reward: [(0, '45.803')] [2024-08-05 20:34:18,129][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000062556_512458752.pth... [2024-08-05 20:34:18,265][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000061850_506675200.pth [2024-08-05 20:34:19,534][15444] Updated weights for policy 0, policy_version 62561 (0.0018) [2024-08-05 20:34:22,879][15444] Updated weights for policy 0, policy_version 62571 (0.0020) [2024-08-05 20:34:23,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 512581632. 
Throughput: 0: 6028.9. Samples: 128139810. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:34:23,119][15372] Avg episode reward: [(0, '44.873')] [2024-08-05 20:34:25,512][15417] Signal inference workers to stop experience collection... (22900 times) [2024-08-05 20:34:25,513][15417] Signal inference workers to resume experience collection... (22900 times) [2024-08-05 20:34:25,553][15444] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-08-05 20:34:25,560][15444] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-08-05 20:34:26,223][15444] Updated weights for policy 0, policy_version 62581 (0.0015) [2024-08-05 20:34:28,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24166.7, 300 sec: 24159.8). Total num frames: 512704512. Throughput: 0: 6037.5. Samples: 128176540. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:34:28,119][15372] Avg episode reward: [(0, '44.712')] [2024-08-05 20:34:29,599][15444] Updated weights for policy 0, policy_version 62591 (0.0021) [2024-08-05 20:34:32,815][15444] Updated weights for policy 0, policy_version 62601 (0.0020) [2024-08-05 20:34:33,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 512827392. Throughput: 0: 6048.0. Samples: 128213700. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:34:33,119][15372] Avg episode reward: [(0, '45.358')] [2024-08-05 20:34:36,192][15444] Updated weights for policy 0, policy_version 62611 (0.0025) [2024-08-05 20:34:38,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 512950272. Throughput: 0: 6060.2. Samples: 128232430. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:34:38,126][15372] Avg episode reward: [(0, '44.888')] [2024-08-05 20:34:39,590][15444] Updated weights for policy 0, policy_version 62621 (0.0024) [2024-08-05 20:34:42,936][15444] Updated weights for policy 0, policy_version 62631 (0.0015) [2024-08-05 20:34:43,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24159.6). Total num frames: 513073152. Throughput: 0: 6073.8. Samples: 128269020. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:34:43,119][15372] Avg episode reward: [(0, '44.314')] [2024-08-05 20:34:46,497][15444] Updated weights for policy 0, policy_version 62641 (0.0013) [2024-08-05 20:34:48,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 513196032. Throughput: 0: 6074.2. Samples: 128304910. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:34:48,119][15372] Avg episode reward: [(0, '44.775')] [2024-08-05 20:34:49,629][15444] Updated weights for policy 0, policy_version 62651 (0.0012) [2024-08-05 20:34:53,079][15444] Updated weights for policy 0, policy_version 62661 (0.0020) [2024-08-05 20:34:53,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24303.0, 300 sec: 24187.2). Total num frames: 513318912. Throughput: 0: 6084.9. Samples: 128323720. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:34:53,119][15372] Avg episode reward: [(0, '44.830')] [2024-08-05 20:34:56,310][15444] Updated weights for policy 0, policy_version 62671 (0.0011) [2024-08-05 20:34:58,119][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 513441792. Throughput: 0: 6087.8. Samples: 128359780. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:34:58,126][15372] Avg episode reward: [(0, '44.970')] [2024-08-05 20:34:59,747][15444] Updated weights for policy 0, policy_version 62681 (0.0023) [2024-08-05 20:35:03,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24302.7, 300 sec: 24159.4). 
Total num frames: 513556480. Throughput: 0: 6102.6. Samples: 128396200. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:03,127][15372] Avg episode reward: [(0, '43.466')] [2024-08-05 20:35:03,175][15444] Updated weights for policy 0, policy_version 62691 (0.0011) [2024-08-05 20:35:06,390][15444] Updated weights for policy 0, policy_version 62701 (0.0014) [2024-08-05 20:35:08,119][15372] Fps is (10 sec: 23756.9, 60 sec: 24439.5, 300 sec: 24159.5). Total num frames: 513679360. Throughput: 0: 6109.8. Samples: 128414750. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:08,126][15372] Avg episode reward: [(0, '43.912')] [2024-08-05 20:35:09,979][15444] Updated weights for policy 0, policy_version 62711 (0.0019) [2024-08-05 20:35:13,118][15372] Fps is (10 sec: 24577.1, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 513802240. Throughput: 0: 6103.6. Samples: 128451200. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:13,126][15372] Avg episode reward: [(0, '43.638')] [2024-08-05 20:35:13,381][15444] Updated weights for policy 0, policy_version 62721 (0.0012) [2024-08-05 20:35:16,600][15444] Updated weights for policy 0, policy_version 62731 (0.0017) [2024-08-05 20:35:18,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24439.5, 300 sec: 24187.2). Total num frames: 513925120. Throughput: 0: 6068.2. Samples: 128486770. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:18,126][15372] Avg episode reward: [(0, '44.349')] [2024-08-05 20:35:20,290][15444] Updated weights for policy 0, policy_version 62741 (0.0011) [2024-08-05 20:35:23,119][15372] Fps is (10 sec: 24575.2, 60 sec: 24439.4, 300 sec: 24187.2). Total num frames: 514048000. Throughput: 0: 6065.5. Samples: 128505380. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:23,119][15372] Avg episode reward: [(0, '45.001')] [2024-08-05 20:35:23,598][15444] Updated weights for policy 0, policy_version 62751 (0.0026) [2024-08-05 20:35:26,957][15444] Updated weights for policy 0, policy_version 62761 (0.0030) [2024-08-05 20:35:28,119][15372] Fps is (10 sec: 23755.7, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 514162688. Throughput: 0: 6048.4. Samples: 128541200. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:28,119][15372] Avg episode reward: [(0, '43.984')] [2024-08-05 20:35:29,114][15417] Signal inference workers to stop experience collection... (22950 times) [2024-08-05 20:35:29,114][15417] Signal inference workers to resume experience collection... (22950 times) [2024-08-05 20:35:29,154][15444] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-08-05 20:35:29,154][15444] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-08-05 20:35:30,371][15444] Updated weights for policy 0, policy_version 62771 (0.0012) [2024-08-05 20:35:33,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.8, 300 sec: 24187.2). Total num frames: 514285568. Throughput: 0: 6075.7. Samples: 128578320. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:33,119][15372] Avg episode reward: [(0, '44.365')] [2024-08-05 20:35:33,494][15444] Updated weights for policy 0, policy_version 62781 (0.0027) [2024-08-05 20:35:37,047][15444] Updated weights for policy 0, policy_version 62791 (0.0020) [2024-08-05 20:35:38,118][15372] Fps is (10 sec: 24577.3, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 514408448. Throughput: 0: 6044.0. Samples: 128595700. 
Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:38,119][15372] Avg episode reward: [(0, '44.749')] [2024-08-05 20:35:40,391][15444] Updated weights for policy 0, policy_version 62801 (0.0019) [2024-08-05 20:35:43,119][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24188.1). Total num frames: 514531328. Throughput: 0: 6062.2. Samples: 128632580. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:43,119][15372] Avg episode reward: [(0, '44.401')] [2024-08-05 20:35:43,773][15444] Updated weights for policy 0, policy_version 62811 (0.0017) [2024-08-05 20:35:47,102][15444] Updated weights for policy 0, policy_version 62821 (0.0017) [2024-08-05 20:35:48,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 514646016. Throughput: 0: 6055.8. Samples: 128668710. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:48,119][15372] Avg episode reward: [(0, '44.130')] [2024-08-05 20:35:50,406][15444] Updated weights for policy 0, policy_version 62831 (0.0016) [2024-08-05 20:35:53,119][15372] Fps is (10 sec: 23757.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 514768896. Throughput: 0: 6052.4. Samples: 128687110. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:53,119][15372] Avg episode reward: [(0, '43.529')] [2024-08-05 20:35:54,131][15444] Updated weights for policy 0, policy_version 62841 (0.0013) [2024-08-05 20:35:57,125][15444] Updated weights for policy 0, policy_version 62851 (0.0018) [2024-08-05 20:35:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 514883584. Throughput: 0: 6052.0. Samples: 128723540. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 20:35:58,126][15372] Avg episode reward: [(0, '44.081')] [2024-08-05 20:36:00,570][15444] Updated weights for policy 0, policy_version 62861 (0.0013) [2024-08-05 20:36:03,119][15372] Fps is (10 sec: 25395.0, 60 sec: 24439.6, 300 sec: 24215.0). 
Total num frames: 515022848. Throughput: 0: 6070.0. Samples: 128759920. Policy #0 lag: (min: 0.0, avg: 3.9, max: 8.0) [2024-08-05 20:36:03,119][15372] Avg episode reward: [(0, '45.245')] [2024-08-05 20:36:04,153][15444] Updated weights for policy 0, policy_version 62871 (0.0013) [2024-08-05 20:36:07,396][15444] Updated weights for policy 0, policy_version 62881 (0.0024) [2024-08-05 20:36:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 515129344. Throughput: 0: 6049.8. Samples: 128777620. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:36:08,119][15372] Avg episode reward: [(0, '44.397')] [2024-08-05 20:36:10,880][15444] Updated weights for policy 0, policy_version 62891 (0.0028) [2024-08-05 20:36:13,118][15372] Fps is (10 sec: 22937.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 515252224. Throughput: 0: 6043.2. Samples: 128813140. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:36:13,126][15372] Avg episode reward: [(0, '44.151')] [2024-08-05 20:36:14,426][15444] Updated weights for policy 0, policy_version 62901 (0.0028) [2024-08-05 20:36:17,930][15444] Updated weights for policy 0, policy_version 62911 (0.0014) [2024-08-05 20:36:18,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 515375104. Throughput: 0: 6012.9. Samples: 128848900. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:36:18,119][15372] Avg episode reward: [(0, '43.731')] [2024-08-05 20:36:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000062912_515375104.pth... 
[2024-08-05 20:36:18,224][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000062204_509575168.pth [2024-08-05 20:36:21,440][15444] Updated weights for policy 0, policy_version 62921 (0.0012) [2024-08-05 20:36:23,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24030.0, 300 sec: 24187.2). Total num frames: 515489792. Throughput: 0: 6036.9. Samples: 128867360. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:36:23,126][15372] Avg episode reward: [(0, '44.128')] [2024-08-05 20:36:24,517][15444] Updated weights for policy 0, policy_version 62931 (0.0012) [2024-08-05 20:36:26,397][15417] Signal inference workers to stop experience collection... (23000 times) [2024-08-05 20:36:26,397][15417] Signal inference workers to resume experience collection... (23000 times) [2024-08-05 20:36:26,466][15444] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-08-05 20:36:26,472][15444] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-08-05 20:36:27,993][15444] Updated weights for policy 0, policy_version 62941 (0.0029) [2024-08-05 20:36:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 515612672. Throughput: 0: 6019.8. Samples: 128903470. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:36:28,119][15372] Avg episode reward: [(0, '44.365')] [2024-08-05 20:36:31,176][15444] Updated weights for policy 0, policy_version 62951 (0.0013) [2024-08-05 20:36:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.6, 300 sec: 24187.2). Total num frames: 515735552. Throughput: 0: 6026.9. Samples: 128939920. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:36:33,119][15372] Avg episode reward: [(0, '44.508')] [2024-08-05 20:36:34,663][15444] Updated weights for policy 0, policy_version 62961 (0.0012) [2024-08-05 20:36:38,091][15444] Updated weights for policy 0, policy_version 62971 (0.0012) [2024-08-05 20:36:38,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24215.0). Total num frames: 515858432. Throughput: 0: 6034.2. Samples: 128958650. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:36:38,119][15372] Avg episode reward: [(0, '44.039')] [2024-08-05 20:36:41,322][15444] Updated weights for policy 0, policy_version 62981 (0.0017) [2024-08-05 20:36:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 515973120. Throughput: 0: 6027.1. Samples: 128994760. Policy #0 lag: (min: 0.0, avg: 4.1, max: 9.0) [2024-08-05 20:36:43,126][15372] Avg episode reward: [(0, '44.452')] [2024-08-05 20:36:44,750][15444] Updated weights for policy 0, policy_version 62991 (0.0021) [2024-08-05 20:36:48,050][15444] Updated weights for policy 0, policy_version 63001 (0.0018) [2024-08-05 20:36:48,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24302.9, 300 sec: 24187.3). Total num frames: 516104192. Throughput: 0: 6022.2. Samples: 129030920. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:36:48,119][15372] Avg episode reward: [(0, '44.143')] [2024-08-05 20:36:51,601][15444] Updated weights for policy 0, policy_version 63011 (0.0021) [2024-08-05 20:36:53,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 516218880. Throughput: 0: 6049.1. Samples: 129049830. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:36:53,126][15372] Avg episode reward: [(0, '44.617')] [2024-08-05 20:36:55,128][15444] Updated weights for policy 0, policy_version 63021 (0.0011) [2024-08-05 20:36:58,120][15444] Updated weights for policy 0, policy_version 63031 (0.0031) [2024-08-05 20:36:58,131][15372] Fps is (10 sec: 24545.7, 60 sec: 24434.4, 300 sec: 24214.0). Total num frames: 516349952. Throughput: 0: 6063.2. Samples: 129086060. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:36:58,131][15372] Avg episode reward: [(0, '45.976')] [2024-08-05 20:36:58,135][15417] Saving new best policy, reward=45.976! [2024-08-05 20:37:01,920][15444] Updated weights for policy 0, policy_version 63041 (0.0041) [2024-08-05 20:37:03,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 516464640. Throughput: 0: 6041.3. Samples: 129120760. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:37:03,119][15372] Avg episode reward: [(0, '45.173')] [2024-08-05 20:37:05,057][15444] Updated weights for policy 0, policy_version 63051 (0.0043) [2024-08-05 20:37:08,120][15372] Fps is (10 sec: 22962.6, 60 sec: 24165.8, 300 sec: 24159.3). Total num frames: 516579328. Throughput: 0: 6054.5. Samples: 129139820. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:37:08,128][15372] Avg episode reward: [(0, '45.852')] [2024-08-05 20:37:08,563][15444] Updated weights for policy 0, policy_version 63061 (0.0012) [2024-08-05 20:37:09,496][15417] Signal inference workers to stop experience collection... (23050 times) [2024-08-05 20:37:09,502][15417] Signal inference workers to resume experience collection... 
(23050 times) [2024-08-05 20:37:09,564][15444] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-08-05 20:37:09,564][15444] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-08-05 20:37:12,075][15444] Updated weights for policy 0, policy_version 63071 (0.0036) [2024-08-05 20:37:13,119][15372] Fps is (10 sec: 23757.4, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 516702208. Throughput: 0: 6053.8. Samples: 129175890. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:37:13,127][15372] Avg episode reward: [(0, '44.600')] [2024-08-05 20:37:15,347][15444] Updated weights for policy 0, policy_version 63081 (0.0013) [2024-08-05 20:37:18,119][15372] Fps is (10 sec: 23759.4, 60 sec: 24029.7, 300 sec: 24159.4). Total num frames: 516816896. Throughput: 0: 6039.5. Samples: 129211700. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:37:18,119][15372] Avg episode reward: [(0, '43.652')] [2024-08-05 20:37:18,854][15444] Updated weights for policy 0, policy_version 63091 (0.0012) [2024-08-05 20:37:22,279][15444] Updated weights for policy 0, policy_version 63101 (0.0014) [2024-08-05 20:37:23,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24187.3). Total num frames: 516939776. Throughput: 0: 6028.9. Samples: 129229950. Policy #0 lag: (min: 0.0, avg: 3.7, max: 7.0) [2024-08-05 20:37:23,119][15372] Avg episode reward: [(0, '44.468')] [2024-08-05 20:37:25,556][15444] Updated weights for policy 0, policy_version 63111 (0.0026) [2024-08-05 20:37:28,119][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 517062656. Throughput: 0: 6012.2. Samples: 129265310. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:37:28,126][15372] Avg episode reward: [(0, '44.510')] [2024-08-05 20:37:29,168][15444] Updated weights for policy 0, policy_version 63121 (0.0022) [2024-08-05 20:37:32,427][15444] Updated weights for policy 0, policy_version 63131 (0.0021) [2024-08-05 20:37:33,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24166.3, 300 sec: 24187.2). Total num frames: 517185536. Throughput: 0: 6014.2. Samples: 129301560. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:37:33,119][15372] Avg episode reward: [(0, '44.950')] [2024-08-05 20:37:35,731][15444] Updated weights for policy 0, policy_version 63141 (0.0019) [2024-08-05 20:37:38,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 517300224. Throughput: 0: 6003.6. Samples: 129319990. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:37:38,126][15372] Avg episode reward: [(0, '45.145')] [2024-08-05 20:37:39,355][15444] Updated weights for policy 0, policy_version 63151 (0.0045) [2024-08-05 20:37:42,744][15444] Updated weights for policy 0, policy_version 63161 (0.0030) [2024-08-05 20:37:43,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 517423104. Throughput: 0: 6008.5. Samples: 129356370. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:37:43,119][15372] Avg episode reward: [(0, '44.398')] [2024-08-05 20:37:46,050][15444] Updated weights for policy 0, policy_version 63171 (0.0021) [2024-08-05 20:37:48,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.3, 300 sec: 24131.8). Total num frames: 517537792. Throughput: 0: 6020.9. Samples: 129391700. 
Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:37:48,119][15372] Avg episode reward: [(0, '44.799')] [2024-08-05 20:37:49,455][15444] Updated weights for policy 0, policy_version 63181 (0.0019) [2024-08-05 20:37:52,935][15444] Updated weights for policy 0, policy_version 63191 (0.0017) [2024-08-05 20:37:53,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24187.3). Total num frames: 517660672. Throughput: 0: 6001.5. Samples: 129409880. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:37:53,119][15372] Avg episode reward: [(0, '45.162')] [2024-08-05 20:37:56,238][15444] Updated weights for policy 0, policy_version 63201 (0.0020) [2024-08-05 20:37:58,118][15372] Fps is (10 sec: 24576.1, 60 sec: 23898.3, 300 sec: 24187.2). Total num frames: 517783552. Throughput: 0: 5997.3. Samples: 129445770. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:37:58,126][15372] Avg episode reward: [(0, '44.996')] [2024-08-05 20:37:59,691][15444] Updated weights for policy 0, policy_version 63211 (0.0012) [2024-08-05 20:38:03,114][15444] Updated weights for policy 0, policy_version 63221 (0.0010) [2024-08-05 20:38:03,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24030.0, 300 sec: 24159.5). Total num frames: 517906432. Throughput: 0: 6008.1. Samples: 129482060. Policy #0 lag: (min: 1.0, avg: 3.9, max: 8.0) [2024-08-05 20:38:03,128][15372] Avg episode reward: [(0, '44.604')] [2024-08-05 20:38:06,441][15444] Updated weights for policy 0, policy_version 63231 (0.0020) [2024-08-05 20:38:08,118][15372] Fps is (10 sec: 22937.4, 60 sec: 23893.9, 300 sec: 24131.7). Total num frames: 518012928. Throughput: 0: 6011.6. Samples: 129500470. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 20:38:08,126][15372] Avg episode reward: [(0, '43.935')] [2024-08-05 20:38:08,297][15417] Signal inference workers to stop experience collection... (23100 times) [2024-08-05 20:38:08,298][15417] Signal inference workers to resume experience collection... 
(23100 times) [2024-08-05 20:38:08,336][15444] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-08-05 20:38:08,342][15444] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-08-05 20:38:10,230][15444] Updated weights for policy 0, policy_version 63241 (0.0026) [2024-08-05 20:38:13,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 518144000. Throughput: 0: 6034.7. Samples: 129536870. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 20:38:13,126][15372] Avg episode reward: [(0, '44.192')] [2024-08-05 20:38:13,226][15444] Updated weights for policy 0, policy_version 63251 (0.0024) [2024-08-05 20:38:16,869][15444] Updated weights for policy 0, policy_version 63261 (0.0018) [2024-08-05 20:38:18,118][15372] Fps is (10 sec: 25395.2, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 518266880. Throughput: 0: 6017.6. Samples: 129572350. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 20:38:18,119][15372] Avg episode reward: [(0, '44.457')] [2024-08-05 20:38:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000063265_518266880.pth... [2024-08-05 20:38:18,223][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000062556_512458752.pth [2024-08-05 20:38:19,920][15444] Updated weights for policy 0, policy_version 63271 (0.0029) [2024-08-05 20:38:23,119][15372] Fps is (10 sec: 23756.1, 60 sec: 24029.7, 300 sec: 24159.5). Total num frames: 518381568. Throughput: 0: 6024.8. Samples: 129591110. 
Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 20:38:23,127][15372] Avg episode reward: [(0, '44.407')] [2024-08-05 20:38:23,593][15444] Updated weights for policy 0, policy_version 63281 (0.0024) [2024-08-05 20:38:27,072][15444] Updated weights for policy 0, policy_version 63291 (0.0013) [2024-08-05 20:38:28,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 518504448. Throughput: 0: 5998.0. Samples: 129626280. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 20:38:28,119][15372] Avg episode reward: [(0, '44.873')] [2024-08-05 20:38:30,186][15444] Updated weights for policy 0, policy_version 63301 (0.0010) [2024-08-05 20:38:33,119][15372] Fps is (10 sec: 24576.5, 60 sec: 24029.9, 300 sec: 24159.4). Total num frames: 518627328. Throughput: 0: 6029.1. Samples: 129663010. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 20:38:33,126][15372] Avg episode reward: [(0, '44.814')] [2024-08-05 20:38:33,795][15444] Updated weights for policy 0, policy_version 63311 (0.0026) [2024-08-05 20:38:37,091][15444] Updated weights for policy 0, policy_version 63321 (0.0013) [2024-08-05 20:38:38,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 518750208. Throughput: 0: 6032.9. Samples: 129681360. Policy #0 lag: (min: 1.0, avg: 3.3, max: 7.0) [2024-08-05 20:38:38,119][15372] Avg episode reward: [(0, '44.689')] [2024-08-05 20:38:40,330][15444] Updated weights for policy 0, policy_version 63331 (0.0016) [2024-08-05 20:38:43,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 518873088. Throughput: 0: 6036.2. Samples: 129717400. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 20:38:43,119][15372] Avg episode reward: [(0, '44.761')] [2024-08-05 20:38:43,765][15444] Updated weights for policy 0, policy_version 63341 (0.0016) [2024-08-05 20:38:47,378][15444] Updated weights for policy 0, policy_version 63351 (0.0013) [2024-08-05 20:38:48,119][15372] Fps is (10 sec: 23756.5, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 518987776. Throughput: 0: 6022.9. Samples: 129753090. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 20:38:48,119][15372] Avg episode reward: [(0, '45.077')] [2024-08-05 20:38:48,861][15417] Signal inference workers to stop experience collection... (23150 times) [2024-08-05 20:38:48,861][15417] Signal inference workers to resume experience collection... (23150 times) [2024-08-05 20:38:48,915][15444] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-08-05 20:38:48,915][15444] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-08-05 20:38:50,683][15444] Updated weights for policy 0, policy_version 63361 (0.0024) [2024-08-05 20:38:53,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 519110656. Throughput: 0: 6036.0. Samples: 129772090. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 20:38:53,119][15372] Avg episode reward: [(0, '44.199')] [2024-08-05 20:38:53,879][15444] Updated weights for policy 0, policy_version 63371 (0.0016) [2024-08-05 20:38:57,566][15444] Updated weights for policy 0, policy_version 63381 (0.0018) [2024-08-05 20:38:58,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 519233536. Throughput: 0: 6024.0. Samples: 129807950. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 20:38:58,119][15372] Avg episode reward: [(0, '43.707')] [2024-08-05 20:39:00,815][15444] Updated weights for policy 0, policy_version 63391 (0.0027) [2024-08-05 20:39:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.8, 300 sec: 24187.2). Total num frames: 519348224. Throughput: 0: 6028.7. Samples: 129843640. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 20:39:03,126][15372] Avg episode reward: [(0, '43.519')] [2024-08-05 20:39:04,275][15444] Updated weights for policy 0, policy_version 63401 (0.0013) [2024-08-05 20:39:07,707][15444] Updated weights for policy 0, policy_version 63411 (0.0012) [2024-08-05 20:39:08,118][15372] Fps is (10 sec: 22937.5, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 519462912. Throughput: 0: 6010.0. Samples: 129861560. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 20:39:08,119][15372] Avg episode reward: [(0, '43.818')] [2024-08-05 20:39:10,902][15444] Updated weights for policy 0, policy_version 63421 (0.0045) [2024-08-05 20:39:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24187.2). Total num frames: 519593984. Throughput: 0: 6033.3. Samples: 129897780. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 20:39:13,126][15372] Avg episode reward: [(0, '43.724')] [2024-08-05 20:39:14,700][15444] Updated weights for policy 0, policy_version 63431 (0.0023) [2024-08-05 20:39:18,119][15372] Fps is (10 sec: 23755.8, 60 sec: 23893.2, 300 sec: 24131.7). Total num frames: 519700480. Throughput: 0: 6003.3. Samples: 129933160. Policy #0 lag: (min: 0.0, avg: 4.2, max: 9.0) [2024-08-05 20:39:18,120][15372] Avg episode reward: [(0, '43.984')] [2024-08-05 20:39:18,222][15444] Updated weights for policy 0, policy_version 63441 (0.0013) [2024-08-05 20:39:21,238][15444] Updated weights for policy 0, policy_version 63451 (0.0022) [2024-08-05 20:39:23,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.5, 300 sec: 24159.5). 
Total num frames: 519831552. Throughput: 0: 6012.7. Samples: 129951930. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:39:23,126][15372] Avg episode reward: [(0, '45.351')] [2024-08-05 20:39:24,865][15444] Updated weights for policy 0, policy_version 63461 (0.0028) [2024-08-05 20:39:26,860][15417] Signal inference workers to stop experience collection... (23200 times) [2024-08-05 20:39:26,861][15417] Signal inference workers to resume experience collection... (23200 times) [2024-08-05 20:39:26,897][15444] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-08-05 20:39:26,897][15444] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-08-05 20:39:27,922][15444] Updated weights for policy 0, policy_version 63471 (0.0014) [2024-08-05 20:39:28,119][15372] Fps is (10 sec: 25396.0, 60 sec: 24166.3, 300 sec: 24159.5). Total num frames: 519954432. Throughput: 0: 6026.7. Samples: 129988600. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:39:28,119][15372] Avg episode reward: [(0, '44.391')] [2024-08-05 20:39:31,398][15444] Updated weights for policy 0, policy_version 63481 (0.0016) [2024-08-05 20:39:33,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 520069120. Throughput: 0: 6039.1. Samples: 130024850. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:39:33,119][15372] Avg episode reward: [(0, '44.210')] [2024-08-05 20:39:35,012][15444] Updated weights for policy 0, policy_version 63491 (0.0021) [2024-08-05 20:39:37,950][15444] Updated weights for policy 0, policy_version 63501 (0.0017) [2024-08-05 20:39:38,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 520200192. Throughput: 0: 6022.0. Samples: 130043080. 
Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:39:38,119][15372] Avg episode reward: [(0, '45.476')] [2024-08-05 20:39:41,709][15444] Updated weights for policy 0, policy_version 63511 (0.0031) [2024-08-05 20:39:43,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 520314880. Throughput: 0: 6022.4. Samples: 130078960. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:39:43,119][15372] Avg episode reward: [(0, '44.928')] [2024-08-05 20:39:45,025][15444] Updated weights for policy 0, policy_version 63521 (0.0017) [2024-08-05 20:39:48,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 520437760. Throughput: 0: 6047.1. Samples: 130115760. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:39:48,126][15372] Avg episode reward: [(0, '44.590')] [2024-08-05 20:39:48,246][15444] Updated weights for policy 0, policy_version 63531 (0.0010) [2024-08-05 20:39:51,799][15444] Updated weights for policy 0, policy_version 63541 (0.0016) [2024-08-05 20:39:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 520560640. Throughput: 0: 6057.8. Samples: 130134160. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:39:53,119][15372] Avg episode reward: [(0, '45.096')] [2024-08-05 20:39:54,908][15444] Updated weights for policy 0, policy_version 63551 (0.0019) [2024-08-05 20:39:58,118][15372] Fps is (10 sec: 23757.3, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 520675328. Throughput: 0: 6072.0. Samples: 130171020. Policy #0 lag: (min: 0.0, avg: 4.5, max: 8.0) [2024-08-05 20:39:58,126][15372] Avg episode reward: [(0, '45.260')] [2024-08-05 20:39:58,382][15444] Updated weights for policy 0, policy_version 63561 (0.0010) [2024-08-05 20:40:01,785][15444] Updated weights for policy 0, policy_version 63571 (0.0012) [2024-08-05 20:40:03,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24302.9, 300 sec: 24159.5). 
Total num frames: 520806400. Throughput: 0: 6082.7. Samples: 130206880. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 20:40:03,119][15372] Avg episode reward: [(0, '45.185')] [2024-08-05 20:40:05,009][15444] Updated weights for policy 0, policy_version 63581 (0.0023) [2024-08-05 20:40:08,118][15372] Fps is (10 sec: 25395.3, 60 sec: 24439.5, 300 sec: 24159.5). Total num frames: 520929280. Throughput: 0: 6085.1. Samples: 130225760. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 20:40:08,126][15372] Avg episode reward: [(0, '44.731')] [2024-08-05 20:40:08,483][15444] Updated weights for policy 0, policy_version 63591 (0.0011) [2024-08-05 20:40:11,903][15444] Updated weights for policy 0, policy_version 63601 (0.0015) [2024-08-05 20:40:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 521052160. Throughput: 0: 6076.7. Samples: 130262050. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 20:40:13,119][15372] Avg episode reward: [(0, '44.755')] [2024-08-05 20:40:15,138][15444] Updated weights for policy 0, policy_version 63611 (0.0012) [2024-08-05 20:40:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24576.2, 300 sec: 24159.5). Total num frames: 521175040. Throughput: 0: 6083.8. Samples: 130298620. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 20:40:18,126][15372] Avg episode reward: [(0, '44.954')] [2024-08-05 20:40:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000063620_521175040.pth... 
[2024-08-05 20:40:18,259][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000062912_515375104.pth [2024-08-05 20:40:18,611][15444] Updated weights for policy 0, policy_version 63621 (0.0026) [2024-08-05 20:40:22,170][15444] Updated weights for policy 0, policy_version 63631 (0.0012) [2024-08-05 20:40:23,119][15372] Fps is (10 sec: 22937.1, 60 sec: 24166.3, 300 sec: 24131.7). Total num frames: 521281536. Throughput: 0: 6068.2. Samples: 130316150. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 20:40:23,119][15372] Avg episode reward: [(0, '45.587')] [2024-08-05 20:40:23,317][15417] Signal inference workers to stop experience collection... (23250 times) [2024-08-05 20:40:23,317][15417] Signal inference workers to resume experience collection... (23250 times) [2024-08-05 20:40:23,392][15444] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-08-05 20:40:23,392][15444] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-08-05 20:40:25,453][15444] Updated weights for policy 0, policy_version 63641 (0.0031) [2024-08-05 20:40:28,121][15372] Fps is (10 sec: 22930.8, 60 sec: 24165.3, 300 sec: 24131.5). Total num frames: 521404416. Throughput: 0: 6073.2. Samples: 130352270. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 20:40:28,122][15372] Avg episode reward: [(0, '44.058')] [2024-08-05 20:40:28,841][15444] Updated weights for policy 0, policy_version 63651 (0.0014) [2024-08-05 20:40:32,478][15444] Updated weights for policy 0, policy_version 63661 (0.0014) [2024-08-05 20:40:33,118][15372] Fps is (10 sec: 24576.6, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 521527296. Throughput: 0: 6054.3. Samples: 130388200. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 20:40:33,119][15372] Avg episode reward: [(0, '44.037')] [2024-08-05 20:40:35,383][15444] Updated weights for policy 0, policy_version 63671 (0.0023) [2024-08-05 20:40:38,118][15372] Fps is (10 sec: 24583.3, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 521650176. Throughput: 0: 6042.2. Samples: 130406060. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-08-05 20:40:38,126][15372] Avg episode reward: [(0, '43.808')] [2024-08-05 20:40:39,269][15444] Updated weights for policy 0, policy_version 63681 (0.0018) [2024-08-05 20:40:42,748][15444] Updated weights for policy 0, policy_version 63691 (0.0020) [2024-08-05 20:40:43,118][15372] Fps is (10 sec: 23756.6, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 521764864. Throughput: 0: 6015.8. Samples: 130441730. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 20:40:43,119][15372] Avg episode reward: [(0, '45.384')] [2024-08-05 20:40:45,894][15444] Updated weights for policy 0, policy_version 63701 (0.0011) [2024-08-05 20:40:48,119][15372] Fps is (10 sec: 23756.4, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 521887744. Throughput: 0: 6018.2. Samples: 130477700. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 20:40:48,126][15372] Avg episode reward: [(0, '45.011')] [2024-08-05 20:40:49,590][15444] Updated weights for policy 0, policy_version 63711 (0.0019) [2024-08-05 20:40:52,751][15444] Updated weights for policy 0, policy_version 63721 (0.0027) [2024-08-05 20:40:53,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 522002432. Throughput: 0: 5988.9. Samples: 130495260. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 20:40:53,119][15372] Avg episode reward: [(0, '44.202')] [2024-08-05 20:40:56,255][15444] Updated weights for policy 0, policy_version 63731 (0.0024) [2024-08-05 20:40:58,118][15372] Fps is (10 sec: 22938.0, 60 sec: 24029.9, 300 sec: 24048.4). 
Total num frames: 522117120. Throughput: 0: 5976.0. Samples: 130530970. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 20:40:58,126][15372] Avg episode reward: [(0, '44.210')] [2024-08-05 20:40:59,761][15444] Updated weights for policy 0, policy_version 63741 (0.0024) [2024-08-05 20:41:02,944][15444] Updated weights for policy 0, policy_version 63751 (0.0019) [2024-08-05 20:41:03,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 522248192. Throughput: 0: 5976.7. Samples: 130567570. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 20:41:03,119][15372] Avg episode reward: [(0, '43.224')] [2024-08-05 20:41:06,660][15444] Updated weights for policy 0, policy_version 63761 (0.0023) [2024-08-05 20:41:07,447][15417] Signal inference workers to stop experience collection... (23300 times) [2024-08-05 20:41:07,447][15417] Signal inference workers to resume experience collection... (23300 times) [2024-08-05 20:41:07,489][15444] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-08-05 20:41:07,489][15444] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-08-05 20:41:08,118][15372] Fps is (10 sec: 24575.9, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 522362880. Throughput: 0: 5998.3. Samples: 130586070. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 20:41:08,119][15372] Avg episode reward: [(0, '43.231')] [2024-08-05 20:41:09,862][15444] Updated weights for policy 0, policy_version 63771 (0.0015) [2024-08-05 20:41:13,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.3, 300 sec: 24103.9). Total num frames: 522485760. Throughput: 0: 5998.4. Samples: 130622180. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 20:41:13,126][15372] Avg episode reward: [(0, '43.831')] [2024-08-05 20:41:13,183][15444] Updated weights for policy 0, policy_version 63781 (0.0021) [2024-08-05 20:41:16,854][15444] Updated weights for policy 0, policy_version 63791 (0.0025) [2024-08-05 20:41:18,118][15372] Fps is (10 sec: 24576.0, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 522608640. Throughput: 0: 5984.4. Samples: 130657500. Policy #0 lag: (min: 0.0, avg: 4.2, max: 7.0) [2024-08-05 20:41:18,119][15372] Avg episode reward: [(0, '44.011')] [2024-08-05 20:41:20,086][15444] Updated weights for policy 0, policy_version 63801 (0.0013) [2024-08-05 20:41:23,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.5, 300 sec: 24131.7). Total num frames: 522731520. Throughput: 0: 6013.8. Samples: 130676680. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:41:23,126][15372] Avg episode reward: [(0, '44.750')] [2024-08-05 20:41:23,559][15444] Updated weights for policy 0, policy_version 63811 (0.0017) [2024-08-05 20:41:26,598][15444] Updated weights for policy 0, policy_version 63821 (0.0015) [2024-08-05 20:41:28,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24031.0, 300 sec: 24103.9). Total num frames: 522846208. Throughput: 0: 6015.8. Samples: 130712440. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:41:28,126][15372] Avg episode reward: [(0, '44.462')] [2024-08-05 20:41:30,143][15444] Updated weights for policy 0, policy_version 63831 (0.0012) [2024-08-05 20:41:33,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 522977280. Throughput: 0: 6046.9. Samples: 130749810. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:41:33,126][15372] Avg episode reward: [(0, '44.949')] [2024-08-05 20:41:33,439][15444] Updated weights for policy 0, policy_version 63841 (0.0012) [2024-08-05 20:41:36,750][15444] Updated weights for policy 0, policy_version 63851 (0.0021) [2024-08-05 20:41:38,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 523091968. Throughput: 0: 6066.9. Samples: 130768270. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:41:38,119][15372] Avg episode reward: [(0, '45.210')] [2024-08-05 20:41:40,124][15444] Updated weights for policy 0, policy_version 63861 (0.0019) [2024-08-05 20:41:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24166.4, 300 sec: 24103.9). Total num frames: 523214848. Throughput: 0: 6079.8. Samples: 130804560. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:41:43,128][15372] Avg episode reward: [(0, '45.342')] [2024-08-05 20:41:43,545][15444] Updated weights for policy 0, policy_version 63871 (0.0025) [2024-08-05 20:41:46,758][15444] Updated weights for policy 0, policy_version 63881 (0.0011) [2024-08-05 20:41:48,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 523337728. Throughput: 0: 6067.3. Samples: 130840600. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:41:48,119][15372] Avg episode reward: [(0, '45.728')] [2024-08-05 20:41:50,414][15444] Updated weights for policy 0, policy_version 63891 (0.0025) [2024-08-05 20:41:53,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24302.9, 300 sec: 24104.9). Total num frames: 523460608. Throughput: 0: 6072.6. Samples: 130859340. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:41:53,119][15372] Avg episode reward: [(0, '44.266')] [2024-08-05 20:41:53,603][15444] Updated weights for policy 0, policy_version 63901 (0.0017) [2024-08-05 20:41:56,974][15444] Updated weights for policy 0, policy_version 63911 (0.0011) [2024-08-05 20:41:58,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24439.5, 300 sec: 24131.7). Total num frames: 523583488. Throughput: 0: 6071.3. Samples: 130895390. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:41:58,119][15372] Avg episode reward: [(0, '44.356')] [2024-08-05 20:42:00,303][15444] Updated weights for policy 0, policy_version 63921 (0.0022) [2024-08-05 20:42:03,118][15372] Fps is (10 sec: 24576.5, 60 sec: 24302.9, 300 sec: 24159.6). Total num frames: 523706368. Throughput: 0: 6095.1. Samples: 130931780. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:42:03,119][15372] Avg episode reward: [(0, '44.135')] [2024-08-05 20:42:03,819][15444] Updated weights for policy 0, policy_version 63931 (0.0016) [2024-08-05 20:42:07,266][15444] Updated weights for policy 0, policy_version 63941 (0.0015) [2024-08-05 20:42:08,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24131.7). Total num frames: 523821056. Throughput: 0: 6070.2. Samples: 130949840. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:42:08,119][15372] Avg episode reward: [(0, '43.733')] [2024-08-05 20:42:10,644][15444] Updated weights for policy 0, policy_version 63951 (0.0019) [2024-08-05 20:42:13,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 523943936. Throughput: 0: 6077.8. Samples: 130985940. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:42:13,126][15372] Avg episode reward: [(0, '44.176')] [2024-08-05 20:42:14,049][15444] Updated weights for policy 0, policy_version 63961 (0.0029) [2024-08-05 20:42:17,313][15444] Updated weights for policy 0, policy_version 63971 (0.0014) [2024-08-05 20:42:18,119][15372] Fps is (10 sec: 24575.3, 60 sec: 24302.8, 300 sec: 24159.4). Total num frames: 524066816. Throughput: 0: 6053.1. Samples: 131022200. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:42:18,127][15372] Avg episode reward: [(0, '43.973')] [2024-08-05 20:42:18,131][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000063973_524066816.pth... [2024-08-05 20:42:18,296][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000063265_518266880.pth [2024-08-05 20:42:20,140][15417] Signal inference workers to stop experience collection... (23350 times) [2024-08-05 20:42:20,140][15417] Signal inference workers to resume experience collection... (23350 times) [2024-08-05 20:42:20,199][15444] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-08-05 20:42:20,199][15444] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-08-05 20:42:20,853][15444] Updated weights for policy 0, policy_version 63981 (0.0021) [2024-08-05 20:42:23,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 524181504. Throughput: 0: 6042.4. Samples: 131040180. 
Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:42:23,119][15372] Avg episode reward: [(0, '43.350')] [2024-08-05 20:42:24,166][15444] Updated weights for policy 0, policy_version 63991 (0.0012) [2024-08-05 20:42:27,529][15444] Updated weights for policy 0, policy_version 64001 (0.0016) [2024-08-05 20:42:28,119][15372] Fps is (10 sec: 23756.7, 60 sec: 24302.8, 300 sec: 24131.7). Total num frames: 524304384. Throughput: 0: 6052.4. Samples: 131076920. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:42:28,119][15372] Avg episode reward: [(0, '45.033')] [2024-08-05 20:42:30,852][15444] Updated weights for policy 0, policy_version 64011 (0.0013) [2024-08-05 20:42:33,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 524427264. Throughput: 0: 6051.8. Samples: 131112930. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:42:33,126][15372] Avg episode reward: [(0, '46.435')] [2024-08-05 20:42:33,127][15417] Saving new best policy, reward=46.435! [2024-08-05 20:42:34,384][15444] Updated weights for policy 0, policy_version 64021 (0.0013) [2024-08-05 20:42:37,769][15444] Updated weights for policy 0, policy_version 64031 (0.0020) [2024-08-05 20:42:38,118][15372] Fps is (10 sec: 23757.7, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 524541952. Throughput: 0: 6025.4. Samples: 131130480. Policy #0 lag: (min: 1.0, avg: 3.8, max: 7.0) [2024-08-05 20:42:38,119][15372] Avg episode reward: [(0, '45.532')] [2024-08-05 20:42:41,279][15444] Updated weights for policy 0, policy_version 64041 (0.0011) [2024-08-05 20:42:43,118][15372] Fps is (10 sec: 23756.7, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 524664832. Throughput: 0: 6023.8. Samples: 131166460. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:42:43,126][15372] Avg episode reward: [(0, '44.722')] [2024-08-05 20:42:44,579][15444] Updated weights for policy 0, policy_version 64051 (0.0025) [2024-08-05 20:42:47,995][15444] Updated weights for policy 0, policy_version 64061 (0.0027) [2024-08-05 20:42:48,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 524787712. Throughput: 0: 6029.1. Samples: 131203090. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:42:48,119][15372] Avg episode reward: [(0, '44.307')] [2024-08-05 20:42:51,300][15444] Updated weights for policy 0, policy_version 64071 (0.0014) [2024-08-05 20:42:53,118][15372] Fps is (10 sec: 24576.0, 60 sec: 24166.5, 300 sec: 24159.5). Total num frames: 524910592. Throughput: 0: 6044.4. Samples: 131221840. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:42:53,126][15372] Avg episode reward: [(0, '45.027')] [2024-08-05 20:42:54,782][15444] Updated weights for policy 0, policy_version 64081 (0.0011) [2024-08-05 20:42:58,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 525025280. Throughput: 0: 6049.8. Samples: 131258180. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:42:58,126][15372] Avg episode reward: [(0, '45.471')] [2024-08-05 20:42:58,213][15444] Updated weights for policy 0, policy_version 64091 (0.0020) [2024-08-05 20:43:01,518][15444] Updated weights for policy 0, policy_version 64101 (0.0022) [2024-08-05 20:43:03,118][15372] Fps is (10 sec: 23756.9, 60 sec: 24029.9, 300 sec: 24187.2). Total num frames: 525148160. Throughput: 0: 6040.9. Samples: 131294040. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:43:03,126][15372] Avg episode reward: [(0, '45.468')] [2024-08-05 20:43:04,732][15444] Updated weights for policy 0, policy_version 64111 (0.0021) [2024-08-05 20:43:08,118][15372] Fps is (10 sec: 24576.1, 60 sec: 24166.4, 300 sec: 24159.5). 
Total num frames: 525271040. Throughput: 0: 6052.2. Samples: 131312530. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:43:08,126][15372] Avg episode reward: [(0, '44.680')] [2024-08-05 20:43:08,251][15444] Updated weights for policy 0, policy_version 64121 (0.0016) [2024-08-05 20:43:11,844][15444] Updated weights for policy 0, policy_version 64131 (0.0020) [2024-08-05 20:43:13,119][15372] Fps is (10 sec: 24575.5, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 525393920. Throughput: 0: 6044.7. Samples: 131348930. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:43:13,119][15372] Avg episode reward: [(0, '43.729')] [2024-08-05 20:43:14,980][15444] Updated weights for policy 0, policy_version 64141 (0.0015) [2024-08-05 20:43:18,119][15372] Fps is (10 sec: 23756.6, 60 sec: 24030.0, 300 sec: 24159.5). Total num frames: 525508608. Throughput: 0: 6044.7. Samples: 131384940. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:43:18,126][15372] Avg episode reward: [(0, '45.041')] [2024-08-05 20:43:18,550][15444] Updated weights for policy 0, policy_version 64151 (0.0014) [2024-08-05 20:43:21,868][15417] Signal inference workers to stop experience collection... (23400 times) [2024-08-05 20:43:21,890][15417] Signal inference workers to resume experience collection... (23400 times) [2024-08-05 20:43:21,918][15444] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-08-05 20:43:21,953][15444] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-08-05 20:43:21,954][15444] Updated weights for policy 0, policy_version 64161 (0.0014) [2024-08-05 20:43:23,118][15372] Fps is (10 sec: 24576.4, 60 sec: 24302.9, 300 sec: 24187.2). Total num frames: 525639680. Throughput: 0: 6066.4. Samples: 131403470. 
Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:43:23,119][15372] Avg episode reward: [(0, '45.020')] [2024-08-05 20:43:25,036][15444] Updated weights for policy 0, policy_version 64171 (0.0014) [2024-08-05 20:43:28,118][15372] Fps is (10 sec: 24576.2, 60 sec: 24166.6, 300 sec: 24159.5). Total num frames: 525754368. Throughput: 0: 6072.4. Samples: 131439720. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:43:28,126][15372] Avg episode reward: [(0, '45.008')] [2024-08-05 20:43:28,714][15444] Updated weights for policy 0, policy_version 64181 (0.0022) [2024-08-05 20:43:32,127][15444] Updated weights for policy 0, policy_version 64191 (0.0018) [2024-08-05 20:43:33,119][15372] Fps is (10 sec: 22937.3, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 525869056. Throughput: 0: 6045.8. Samples: 131475150. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:43:33,119][15372] Avg episode reward: [(0, '45.297')] [2024-08-05 20:43:35,386][15444] Updated weights for policy 0, policy_version 64201 (0.0013) [2024-08-05 20:43:38,119][15372] Fps is (10 sec: 24575.7, 60 sec: 24302.9, 300 sec: 24159.5). Total num frames: 526000128. Throughput: 0: 6039.5. Samples: 131493620. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:43:38,119][15372] Avg episode reward: [(0, '44.667')] [2024-08-05 20:43:38,838][15444] Updated weights for policy 0, policy_version 64211 (0.0011) [2024-08-05 20:43:42,126][15444] Updated weights for policy 0, policy_version 64221 (0.0012) [2024-08-05 20:43:43,118][15372] Fps is (10 sec: 24576.3, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 526114816. Throughput: 0: 6033.6. Samples: 131529690. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:43:43,119][15372] Avg episode reward: [(0, '44.144')] [2024-08-05 20:43:45,877][15444] Updated weights for policy 0, policy_version 64231 (0.0014) [2024-08-05 20:43:48,120][15372] Fps is (10 sec: 23752.8, 60 sec: 24165.7, 300 sec: 24159.3). 
Total num frames: 526237696. Throughput: 0: 6057.3. Samples: 131566630. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:43:48,120][15372] Avg episode reward: [(0, '44.083')] [2024-08-05 20:43:48,767][15444] Updated weights for policy 0, policy_version 64241 (0.0015) [2024-08-05 20:43:52,652][15444] Updated weights for policy 0, policy_version 64251 (0.0032) [2024-08-05 20:43:52,775][15417] Signal inference workers to stop experience collection... (23450 times) [2024-08-05 20:43:52,775][15417] Signal inference workers to resume experience collection... (23450 times) [2024-08-05 20:43:52,807][15444] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-08-05 20:43:52,808][15444] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-08-05 20:43:53,119][15372] Fps is (10 sec: 24575.8, 60 sec: 24166.4, 300 sec: 24159.4). Total num frames: 526360576. Throughput: 0: 6033.3. Samples: 131584030. Policy #0 lag: (min: 1.0, avg: 4.0, max: 7.0) [2024-08-05 20:43:53,119][15372] Avg episode reward: [(0, '44.658')] [2024-08-05 20:43:56,092][15444] Updated weights for policy 0, policy_version 64261 (0.0024) [2024-08-05 20:43:58,118][15372] Fps is (10 sec: 23761.1, 60 sec: 24166.4, 300 sec: 24159.5). Total num frames: 526475264. Throughput: 0: 6021.8. Samples: 131619910. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:43:58,126][15372] Avg episode reward: [(0, '44.925')] [2024-08-05 20:43:59,288][15444] Updated weights for policy 0, policy_version 64271 (0.0032) [2024-08-05 20:44:03,118][15372] Fps is (10 sec: 22118.7, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 526581760. Throughput: 0: 6003.8. Samples: 131655110. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:44:03,126][15372] Avg episode reward: [(0, '44.545')] [2024-08-05 20:44:03,141][15444] Updated weights for policy 0, policy_version 64281 (0.0011) [2024-08-05 20:44:06,029][15444] Updated weights for policy 0, policy_version 64291 (0.0027) [2024-08-05 20:44:08,119][15372] Fps is (10 sec: 24575.6, 60 sec: 24166.3, 300 sec: 24159.4). Total num frames: 526721024. Throughput: 0: 5988.0. Samples: 131672930. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:44:08,119][15372] Avg episode reward: [(0, '44.053')] [2024-08-05 20:44:09,761][15444] Updated weights for policy 0, policy_version 64301 (0.0029) [2024-08-05 20:44:13,124][15372] Fps is (10 sec: 24561.7, 60 sec: 23891.1, 300 sec: 24159.0). Total num frames: 526827520. Throughput: 0: 5969.2. Samples: 131708370. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:44:13,132][15372] Avg episode reward: [(0, '44.257')] [2024-08-05 20:44:13,228][15444] Updated weights for policy 0, policy_version 64311 (0.0012) [2024-08-05 20:44:16,515][15444] Updated weights for policy 0, policy_version 64321 (0.0011) [2024-08-05 20:44:18,118][15372] Fps is (10 sec: 22938.0, 60 sec: 24029.9, 300 sec: 24131.7). Total num frames: 526950400. Throughput: 0: 5963.8. Samples: 131743520. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:44:18,126][15372] Avg episode reward: [(0, '44.182')] [2024-08-05 20:44:18,130][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000064325_526950400.pth... [2024-08-05 20:44:18,260][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000063620_521175040.pth [2024-08-05 20:44:20,079][15444] Updated weights for policy 0, policy_version 64331 (0.0013) [2024-08-05 20:44:23,118][15372] Fps is (10 sec: 24590.4, 60 sec: 23893.3, 300 sec: 24131.7). Total num frames: 527073280. 
Throughput: 0: 5962.9. Samples: 131761950. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:44:23,126][15372] Avg episode reward: [(0, '43.755')] [2024-08-05 20:44:23,309][15444] Updated weights for policy 0, policy_version 64341 (0.0014) [2024-08-05 20:44:26,756][15444] Updated weights for policy 0, policy_version 64351 (0.0013) [2024-08-05 20:44:27,148][15417] Signal inference workers to stop experience collection... (23500 times) [2024-08-05 20:44:27,153][15417] Signal inference workers to resume experience collection... (23500 times) [2024-08-05 20:44:27,228][15444] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-08-05 20:44:27,234][15444] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-08-05 20:44:28,118][15372] Fps is (10 sec: 23756.8, 60 sec: 23893.4, 300 sec: 24131.7). Total num frames: 527187968. Throughput: 0: 5969.6. Samples: 131798320. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:44:28,119][15372] Avg episode reward: [(0, '43.942')] [2024-08-05 20:44:30,248][15444] Updated weights for policy 0, policy_version 64361 (0.0015) [2024-08-05 20:44:33,118][15372] Fps is (10 sec: 24575.9, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 527319040. Throughput: 0: 5969.3. Samples: 131835240. Policy #0 lag: (min: 1.0, avg: 4.2, max: 8.0) [2024-08-05 20:44:33,127][15372] Avg episode reward: [(0, '45.519')] [2024-08-05 20:44:33,370][15444] Updated weights for policy 0, policy_version 64371 (0.0020) [2024-08-05 20:44:36,991][15444] Updated weights for policy 0, policy_version 64381 (0.0021) [2024-08-05 20:44:38,119][15372] Fps is (10 sec: 25394.6, 60 sec: 24029.8, 300 sec: 24159.4). Total num frames: 527441920. Throughput: 0: 5984.2. Samples: 131853320. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:44:38,119][15372] Avg episode reward: [(0, '44.842')] [2024-08-05 20:44:40,058][15444] Updated weights for policy 0, policy_version 64391 (0.0012) [2024-08-05 20:44:43,118][15372] Fps is (10 sec: 23756.8, 60 sec: 24029.8, 300 sec: 24131.7). Total num frames: 527556608. Throughput: 0: 5998.9. Samples: 131889860. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:44:43,126][15372] Avg episode reward: [(0, '44.166')] [2024-08-05 20:44:43,553][15444] Updated weights for policy 0, policy_version 64401 (0.0016) [2024-08-05 20:44:47,153][15444] Updated weights for policy 0, policy_version 64411 (0.0013) [2024-08-05 20:44:48,119][15372] Fps is (10 sec: 22937.8, 60 sec: 23894.0, 300 sec: 24103.9). Total num frames: 527671296. Throughput: 0: 6002.2. Samples: 131925210. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:44:48,119][15372] Avg episode reward: [(0, '44.605')] [2024-08-05 20:44:50,586][15444] Updated weights for policy 0, policy_version 64421 (0.0014) [2024-08-05 20:44:53,119][15372] Fps is (10 sec: 24575.9, 60 sec: 24029.9, 300 sec: 24159.5). Total num frames: 527802368. Throughput: 0: 6028.7. Samples: 131944220. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:44:53,119][15372] Avg episode reward: [(0, '44.679')] [2024-08-05 20:44:53,735][15444] Updated weights for policy 0, policy_version 64431 (0.0013) [2024-08-05 20:44:57,251][15444] Updated weights for policy 0, policy_version 64441 (0.0017) [2024-08-05 20:44:58,118][15372] Fps is (10 sec: 25395.5, 60 sec: 24166.4, 300 sec: 24131.7). Total num frames: 527925248. Throughput: 0: 6049.4. Samples: 131980560. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:44:58,119][15372] Avg episode reward: [(0, '44.459')] [2024-08-05 20:45:00,716][15444] Updated weights for policy 0, policy_version 64451 (0.0014) [2024-08-05 20:45:03,119][15372] Fps is (10 sec: 22937.7, 60 sec: 24166.4, 300 sec: 24076.1). 
Total num frames: 528031744. Throughput: 0: 6069.8. Samples: 132016660. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:45:03,126][15372] Avg episode reward: [(0, '44.301')] [2024-08-05 20:45:03,897][15444] Updated weights for policy 0, policy_version 64461 (0.0021) [2024-08-05 20:45:08,119][15372] Fps is (10 sec: 15564.1, 60 sec: 22664.4, 300 sec: 23826.2). Total num frames: 528080896. Throughput: 0: 5934.6. Samples: 132029010. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:45:08,127][15372] Avg episode reward: [(0, '44.814')] [2024-08-05 20:45:13,118][15372] Fps is (10 sec: 9830.4, 60 sec: 21710.9, 300 sec: 23576.3). Total num frames: 528130048. Throughput: 0: 5232.0. Samples: 132033760. Policy #0 lag: (min: 0.0, avg: 4.1, max: 7.0) [2024-08-05 20:45:13,126][15372] Avg episode reward: [(0, '44.606')] [2024-08-05 20:45:13,552][15417] Signal inference workers to stop experience collection... (23550 times) [2024-08-05 20:45:13,553][15417] Signal inference workers to resume experience collection... (23550 times) [2024-08-05 20:45:13,618][15444] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-08-05 20:45:13,618][15444] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-08-05 20:45:13,657][15444] Updated weights for policy 0, policy_version 64471 (0.0026) [2024-08-05 20:45:17,375][15444] Updated weights for policy 0, policy_version 64481 (0.0012) [2024-08-05 20:45:18,119][15372] Fps is (10 sec: 15565.3, 60 sec: 21435.7, 300 sec: 23576.3). Total num frames: 528236544. Throughput: 0: 5120.4. Samples: 132065660. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:45:18,119][15372] Avg episode reward: [(0, '44.285')] [2024-08-05 20:45:21,211][15444] Updated weights for policy 0, policy_version 64491 (0.0025) [2024-08-05 20:45:23,118][15372] Fps is (10 sec: 22118.5, 60 sec: 21299.2, 300 sec: 23548.8). Total num frames: 528351232. Throughput: 0: 5102.2. Samples: 132082920. 
Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:45:23,119][15372] Avg episode reward: [(0, '44.375')] [2024-08-05 20:45:24,834][15444] Updated weights for policy 0, policy_version 64501 (0.0026) [2024-08-05 20:45:28,118][15372] Fps is (10 sec: 22937.9, 60 sec: 21299.2, 300 sec: 23520.8). Total num frames: 528465920. Throughput: 0: 5058.7. Samples: 132117500. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:45:28,126][15372] Avg episode reward: [(0, '43.227')] [2024-08-05 20:45:28,321][15444] Updated weights for policy 0, policy_version 64511 (0.0018) [2024-08-05 20:45:31,853][15444] Updated weights for policy 0, policy_version 64521 (0.0014) [2024-08-05 20:45:33,118][15372] Fps is (10 sec: 23756.8, 60 sec: 21162.7, 300 sec: 23520.8). Total num frames: 528588800. Throughput: 0: 5032.7. Samples: 132151680. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:45:33,119][15372] Avg episode reward: [(0, '43.908')] [2024-08-05 20:45:35,154][15444] Updated weights for policy 0, policy_version 64531 (0.0022) [2024-08-05 20:45:38,118][15372] Fps is (10 sec: 22937.6, 60 sec: 20889.7, 300 sec: 23493.0). Total num frames: 528695296. Throughput: 0: 5014.7. Samples: 132169880. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:45:38,126][15372] Avg episode reward: [(0, '44.726')] [2024-08-05 20:45:38,915][15444] Updated weights for policy 0, policy_version 64541 (0.0021) [2024-08-05 20:45:42,732][15444] Updated weights for policy 0, policy_version 64551 (0.0028) [2024-08-05 20:45:43,118][15372] Fps is (10 sec: 21299.2, 60 sec: 20753.1, 300 sec: 23437.5). Total num frames: 528801792. Throughput: 0: 4960.7. Samples: 132203790. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:45:43,119][15372] Avg episode reward: [(0, '44.603')] [2024-08-05 20:45:46,259][15444] Updated weights for policy 0, policy_version 64561 (0.0023) [2024-08-05 20:45:48,118][15372] Fps is (10 sec: 22118.4, 60 sec: 20753.1, 300 sec: 23437.5). 
Total num frames: 528916480. Throughput: 0: 4868.9. Samples: 132235760. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:45:48,126][15372] Avg episode reward: [(0, '45.573')] [2024-08-05 20:45:49,910][15444] Updated weights for policy 0, policy_version 64571 (0.0035) [2024-08-05 20:45:53,119][15372] Fps is (10 sec: 22937.1, 60 sec: 20480.0, 300 sec: 23437.4). Total num frames: 529031168. Throughput: 0: 4973.4. Samples: 132252810. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-08-05 20:45:53,127][15372] Avg episode reward: [(0, '44.920')] [2024-08-05 20:45:53,994][15444] Updated weights for policy 0, policy_version 64581 (0.0022) [2024-08-05 20:45:57,439][15444] Updated weights for policy 0, policy_version 64591 (0.0032) [2024-08-05 20:45:58,118][15372] Fps is (10 sec: 22937.6, 60 sec: 20343.5, 300 sec: 23381.9). Total num frames: 529145856. Throughput: 0: 5604.2. Samples: 132285950. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 20:45:58,119][15372] Avg episode reward: [(0, '44.176')] [2024-08-05 20:46:01,223][15444] Updated weights for policy 0, policy_version 64601 (0.0013) [2024-08-05 20:46:03,119][15372] Fps is (10 sec: 21299.5, 60 sec: 20206.9, 300 sec: 23326.4). Total num frames: 529244160. Throughput: 0: 5646.5. Samples: 132319750. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 20:46:03,119][15372] Avg episode reward: [(0, '44.935')] [2024-08-05 20:46:04,419][15444] Updated weights for policy 0, policy_version 64611 (0.0021) [2024-08-05 20:46:05,767][15417] Signal inference workers to stop experience collection... (23600 times) [2024-08-05 20:46:05,770][15417] Signal inference workers to resume experience collection... 
(23600 times) [2024-08-05 20:46:05,825][15444] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-08-05 20:46:05,832][15444] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-08-05 20:46:08,118][15372] Fps is (10 sec: 22118.4, 60 sec: 21435.9, 300 sec: 23326.4). Total num frames: 529367040. Throughput: 0: 5658.2. Samples: 132337540. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 20:46:08,119][15372] Avg episode reward: [(0, '44.873')] [2024-08-05 20:46:08,218][15444] Updated weights for policy 0, policy_version 64621 (0.0022) [2024-08-05 20:46:11,855][15444] Updated weights for policy 0, policy_version 64631 (0.0024) [2024-08-05 20:46:13,118][15372] Fps is (10 sec: 24576.1, 60 sec: 22664.5, 300 sec: 23326.4). Total num frames: 529489920. Throughput: 0: 5658.2. Samples: 132372120. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 20:46:13,119][15372] Avg episode reward: [(0, '44.782')] [2024-08-05 20:46:15,186][15444] Updated weights for policy 0, policy_version 64641 (0.0011) [2024-08-05 20:46:18,118][15372] Fps is (10 sec: 22937.6, 60 sec: 22664.6, 300 sec: 23270.8). Total num frames: 529596416. Throughput: 0: 5670.0. Samples: 132406830. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 20:46:18,119][15372] Avg episode reward: [(0, '45.063')] [2024-08-05 20:46:18,193][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000064649_529604608.pth... [2024-08-05 20:46:18,310][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000063973_524066816.pth [2024-08-05 20:46:19,314][15444] Updated weights for policy 0, policy_version 64651 (0.0027) [2024-08-05 20:46:23,119][15372] Fps is (10 sec: 19660.7, 60 sec: 22254.9, 300 sec: 23187.5). Total num frames: 529686528. Throughput: 0: 5550.2. Samples: 132419640. 
Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 20:46:23,126][15372] Avg episode reward: [(0, '45.285')] [2024-08-05 20:46:23,517][15444] Updated weights for policy 0, policy_version 64661 (0.0013) [2024-08-05 20:46:27,317][15444] Updated weights for policy 0, policy_version 64671 (0.0041) [2024-08-05 20:46:28,119][15372] Fps is (10 sec: 19660.5, 60 sec: 22118.3, 300 sec: 23104.2). Total num frames: 529793024. Throughput: 0: 5479.5. Samples: 132450370. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 20:46:28,119][15372] Avg episode reward: [(0, '44.294')] [2024-08-05 20:46:31,142][15444] Updated weights for policy 0, policy_version 64681 (0.0011) [2024-08-05 20:46:33,118][15372] Fps is (10 sec: 22118.6, 60 sec: 21981.9, 300 sec: 23104.2). Total num frames: 529907712. Throughput: 0: 5487.3. Samples: 132482690. Policy #0 lag: (min: 1.0, avg: 4.1, max: 7.0) [2024-08-05 20:46:33,119][15372] Avg episode reward: [(0, '44.245')] [2024-08-05 20:46:35,103][15444] Updated weights for policy 0, policy_version 64691 (0.0029) [2024-08-05 20:46:38,118][15372] Fps is (10 sec: 21299.4, 60 sec: 21845.3, 300 sec: 23020.9). Total num frames: 530006016. Throughput: 0: 5460.7. Samples: 132498540. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:46:38,127][15372] Avg episode reward: [(0, '45.078')] [2024-08-05 20:46:38,745][15444] Updated weights for policy 0, policy_version 64701 (0.0014) [2024-08-05 20:46:42,489][15444] Updated weights for policy 0, policy_version 64711 (0.0013) [2024-08-05 20:46:43,119][15372] Fps is (10 sec: 20479.0, 60 sec: 21845.2, 300 sec: 22965.3). Total num frames: 530112512. Throughput: 0: 5447.3. Samples: 132531080. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:46:43,120][15372] Avg episode reward: [(0, '44.789')] [2024-08-05 20:46:46,852][15444] Updated weights for policy 0, policy_version 64721 (0.0031) [2024-08-05 20:46:48,118][15372] Fps is (10 sec: 21299.3, 60 sec: 21708.8, 300 sec: 22909.8). 
Total num frames: 530219008. Throughput: 0: 5360.0. Samples: 132560950. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:46:48,119][15372] Avg episode reward: [(0, '44.549')] [2024-08-05 20:46:51,320][15444] Updated weights for policy 0, policy_version 64731 (0.0020) [2024-08-05 20:46:53,120][15372] Fps is (10 sec: 18839.5, 60 sec: 21162.2, 300 sec: 22770.9). Total num frames: 530300928. Throughput: 0: 5251.8. Samples: 132573880. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:46:53,122][15372] Avg episode reward: [(0, '45.520')] [2024-08-05 20:46:55,639][15444] Updated weights for policy 0, policy_version 64741 (0.0016) [2024-08-05 20:46:58,118][15372] Fps is (10 sec: 18841.6, 60 sec: 21026.1, 300 sec: 22715.4). Total num frames: 530407424. Throughput: 0: 5113.6. Samples: 132602230. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:46:58,119][15372] Avg episode reward: [(0, '45.309')] [2024-08-05 20:47:00,299][15444] Updated weights for policy 0, policy_version 64751 (0.0023) [2024-08-05 20:47:03,119][15372] Fps is (10 sec: 18844.0, 60 sec: 20753.0, 300 sec: 22604.3). Total num frames: 530489344. Throughput: 0: 4941.5. Samples: 132629200. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:47:03,119][15372] Avg episode reward: [(0, '44.749')] [2024-08-05 20:47:04,721][15444] Updated weights for policy 0, policy_version 64761 (0.0023) [2024-08-05 20:47:08,119][15372] Fps is (10 sec: 18022.0, 60 sec: 20343.4, 300 sec: 22521.0). Total num frames: 530587648. Throughput: 0: 4967.1. Samples: 132643160. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:47:08,127][15372] Avg episode reward: [(0, '44.030')] [2024-08-05 20:47:09,270][15444] Updated weights for policy 0, policy_version 64771 (0.0014) [2024-08-05 20:47:13,027][15444] Updated weights for policy 0, policy_version 64781 (0.0034) [2024-08-05 20:47:13,120][15372] Fps is (10 sec: 19657.7, 60 sec: 19933.2, 300 sec: 22437.6). Total num frames: 530685952. 
Throughput: 0: 4921.8. Samples: 132671860. Policy #0 lag: (min: 0.0, avg: 3.8, max: 7.0) [2024-08-05 20:47:13,121][15372] Avg episode reward: [(0, '43.351')] [2024-08-05 20:47:17,792][15444] Updated weights for policy 0, policy_version 64791 (0.0019) [2024-08-05 20:47:18,119][15372] Fps is (10 sec: 18022.6, 60 sec: 19524.2, 300 sec: 22326.7). Total num frames: 530767872. Throughput: 0: 4828.4. Samples: 132699970. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:47:18,119][15372] Avg episode reward: [(0, '43.821')] [2024-08-05 20:47:21,866][15444] Updated weights for policy 0, policy_version 64801 (0.0033) [2024-08-05 20:47:23,119][15372] Fps is (10 sec: 17204.9, 60 sec: 19524.0, 300 sec: 22215.6). Total num frames: 530857984. Throughput: 0: 4786.1. Samples: 132713920. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:47:23,120][15372] Avg episode reward: [(0, '44.301')] [2024-08-05 20:47:26,620][15444] Updated weights for policy 0, policy_version 64811 (0.0013) [2024-08-05 20:47:28,119][15372] Fps is (10 sec: 18020.9, 60 sec: 19251.0, 300 sec: 22104.4). Total num frames: 530948096. Throughput: 0: 4651.1. Samples: 132740380. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:47:28,120][15372] Avg episode reward: [(0, '43.991')] [2024-08-05 20:47:31,148][15444] Updated weights for policy 0, policy_version 64821 (0.0015) [2024-08-05 20:47:31,293][15417] Signal inference workers to stop experience collection... (23650 times) [2024-08-05 20:47:31,293][15417] Signal inference workers to resume experience collection... (23650 times) [2024-08-05 20:47:31,318][15444] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-08-05 20:47:31,324][15444] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-08-05 20:47:33,118][15372] Fps is (10 sec: 18024.0, 60 sec: 18841.6, 300 sec: 22021.2). Total num frames: 531038208. Throughput: 0: 4594.4. Samples: 132767700. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:47:33,119][15372] Avg episode reward: [(0, '44.028')] [2024-08-05 20:47:35,652][15444] Updated weights for policy 0, policy_version 64831 (0.0013) [2024-08-05 20:47:38,118][15372] Fps is (10 sec: 18843.3, 60 sec: 18841.6, 300 sec: 21937.9). Total num frames: 531136512. Throughput: 0: 4636.2. Samples: 132782500. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:47:38,119][15372] Avg episode reward: [(0, '43.682')] [2024-08-05 20:47:40,089][15444] Updated weights for policy 0, policy_version 64841 (0.0025) [2024-08-05 20:47:43,119][15372] Fps is (10 sec: 18841.3, 60 sec: 18568.6, 300 sec: 21826.8). Total num frames: 531226624. Throughput: 0: 4612.0. Samples: 132809770. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:47:43,128][15372] Avg episode reward: [(0, '43.651')] [2024-08-05 20:47:44,734][15444] Updated weights for policy 0, policy_version 64851 (0.0019) [2024-08-05 20:47:48,119][15372] Fps is (10 sec: 17202.9, 60 sec: 18158.9, 300 sec: 21688.0). Total num frames: 531308544. Throughput: 0: 4571.1. Samples: 132834900. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:47:48,128][15372] Avg episode reward: [(0, '43.917')] [2024-08-05 20:47:49,752][15444] Updated weights for policy 0, policy_version 64861 (0.0028) [2024-08-05 20:47:53,119][15372] Fps is (10 sec: 17203.2, 60 sec: 18295.9, 300 sec: 21604.7). Total num frames: 531398656. Throughput: 0: 4552.2. Samples: 132848010. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:47:53,129][15372] Avg episode reward: [(0, '45.096')] [2024-08-05 20:47:54,009][15444] Updated weights for policy 0, policy_version 64871 (0.0025) [2024-08-05 20:47:58,119][15372] Fps is (10 sec: 18022.5, 60 sec: 18022.4, 300 sec: 21493.6). Total num frames: 531488768. Throughput: 0: 4514.8. Samples: 132875020. 
Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:47:58,126][15372] Avg episode reward: [(0, '45.376')] [2024-08-05 20:47:58,715][15444] Updated weights for policy 0, policy_version 64881 (0.0030) [2024-08-05 20:48:02,656][15444] Updated weights for policy 0, policy_version 64891 (0.0019) [2024-08-05 20:48:03,119][15372] Fps is (10 sec: 18841.0, 60 sec: 18295.4, 300 sec: 21410.2). Total num frames: 531587072. Throughput: 0: 4517.3. Samples: 132903250. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:48:03,119][15372] Avg episode reward: [(0, '44.572')] [2024-08-05 20:48:07,430][15444] Updated weights for policy 0, policy_version 64901 (0.0025) [2024-08-05 20:48:08,127][15372] Fps is (10 sec: 18827.4, 60 sec: 18156.7, 300 sec: 21298.7). Total num frames: 531677184. Throughput: 0: 4521.3. Samples: 132917410. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:48:08,127][15372] Avg episode reward: [(0, '44.659')] [2024-08-05 20:48:12,082][15444] Updated weights for policy 0, policy_version 64911 (0.0020) [2024-08-05 20:48:13,119][15372] Fps is (10 sec: 18022.5, 60 sec: 18022.9, 300 sec: 21215.9). Total num frames: 531767296. Throughput: 0: 4521.2. Samples: 132943830. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:48:13,121][15372] Avg episode reward: [(0, '45.133')] [2024-08-05 20:48:16,452][15444] Updated weights for policy 0, policy_version 64921 (0.0031) [2024-08-05 20:48:18,119][15372] Fps is (10 sec: 18854.8, 60 sec: 18295.3, 300 sec: 21104.8). Total num frames: 531865600. Throughput: 0: 4527.0. Samples: 132971420. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:48:18,127][15372] Avg episode reward: [(0, '44.767')] [2024-08-05 20:48:18,135][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000064925_531865600.pth... 
[2024-08-05 20:48:18,285][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000064325_526950400.pth [2024-08-05 20:48:20,923][15444] Updated weights for policy 0, policy_version 64931 (0.0014) [2024-08-05 20:48:23,119][15372] Fps is (10 sec: 18023.0, 60 sec: 18159.2, 300 sec: 20993.7). Total num frames: 531947520. Throughput: 0: 4499.3. Samples: 132984970. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:48:23,119][15372] Avg episode reward: [(0, '45.234')] [2024-08-05 20:48:26,172][15444] Updated weights for policy 0, policy_version 64941 (0.0031) [2024-08-05 20:48:28,119][15372] Fps is (10 sec: 16384.5, 60 sec: 18022.6, 300 sec: 20882.6). Total num frames: 532029440. Throughput: 0: 4444.0. Samples: 133009750. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:48:28,127][15372] Avg episode reward: [(0, '45.238')] [2024-08-05 20:48:31,160][15444] Updated weights for policy 0, policy_version 64951 (0.0023) [2024-08-05 20:48:33,119][15372] Fps is (10 sec: 16383.6, 60 sec: 17885.8, 300 sec: 20716.0). Total num frames: 532111360. Throughput: 0: 4419.1. Samples: 133033760. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-08-05 20:48:33,120][15372] Avg episode reward: [(0, '45.104')] [2024-08-05 20:48:37,432][15444] Updated weights for policy 0, policy_version 64961 (0.0028) [2024-08-05 20:48:38,118][15372] Fps is (10 sec: 14746.2, 60 sec: 17339.7, 300 sec: 20549.4). Total num frames: 532176896. Throughput: 0: 4366.2. Samples: 133044490. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:48:38,119][15372] Avg episode reward: [(0, '44.729')] [2024-08-05 20:48:43,119][15372] Fps is (10 sec: 12288.1, 60 sec: 16793.6, 300 sec: 20327.4). Total num frames: 532234240. Throughput: 0: 4166.7. Samples: 133062520. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:48:43,126][15372] Avg episode reward: [(0, '44.138')] [2024-08-05 20:48:43,460][15444] Updated weights for policy 0, policy_version 64971 (0.0014) [2024-08-05 20:48:48,120][15372] Fps is (10 sec: 11466.7, 60 sec: 16383.5, 300 sec: 20105.0). Total num frames: 532291584. Throughput: 0: 3945.0. Samples: 133080780. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:48:48,136][15372] Avg episode reward: [(0, '44.398')] [2024-08-05 20:48:49,832][15444] Updated weights for policy 0, policy_version 64981 (0.0018) [2024-08-05 20:48:53,119][15372] Fps is (10 sec: 11468.9, 60 sec: 15837.9, 300 sec: 19910.7). Total num frames: 532348928. Throughput: 0: 3845.3. Samples: 133090420. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:48:53,152][15372] Avg episode reward: [(0, '43.894')] [2024-08-05 20:48:56,387][15444] Updated weights for policy 0, policy_version 64991 (0.0021) [2024-08-05 20:48:58,118][15372] Fps is (10 sec: 12290.2, 60 sec: 15428.3, 300 sec: 19771.9). Total num frames: 532414464. Throughput: 0: 3668.5. Samples: 133108910. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:48:58,128][15372] Avg episode reward: [(0, '44.650')] [2024-08-05 20:49:03,119][15372] Fps is (10 sec: 13107.2, 60 sec: 14882.2, 300 sec: 19522.0). Total num frames: 532480000. Throughput: 0: 3456.7. Samples: 133126970. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:49:03,126][15372] Avg episode reward: [(0, '44.320')] [2024-08-05 20:49:03,307][15444] Updated weights for policy 0, policy_version 65001 (0.0045) [2024-08-05 20:49:08,119][15372] Fps is (10 sec: 12287.4, 60 sec: 14337.7, 300 sec: 19355.7). Total num frames: 532537344. Throughput: 0: 3357.7. Samples: 133136070. 
Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:49:08,127][15372] Avg episode reward: [(0, '44.282')] [2024-08-05 20:49:09,667][15444] Updated weights for policy 0, policy_version 65011 (0.0016) [2024-08-05 20:49:13,118][15372] Fps is (10 sec: 14745.8, 60 sec: 14336.1, 300 sec: 19244.3). Total num frames: 532627456. Throughput: 0: 3311.4. Samples: 133158760. Policy #0 lag: (min: 0.0, avg: 4.2, max: 8.0) [2024-08-05 20:49:13,127][15372] Avg episode reward: [(0, '44.021')] [2024-08-05 20:49:14,630][15444] Updated weights for policy 0, policy_version 65021 (0.0027) [2024-08-05 20:49:18,119][15372] Fps is (10 sec: 18023.2, 60 sec: 14199.6, 300 sec: 19133.2). Total num frames: 532717568. Throughput: 0: 3348.7. Samples: 133184450. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:49:18,119][15372] Avg episode reward: [(0, '43.989')] [2024-08-05 20:49:19,163][15444] Updated weights for policy 0, policy_version 65031 (0.0019) [2024-08-05 20:49:23,128][15372] Fps is (10 sec: 18005.1, 60 sec: 14333.7, 300 sec: 19049.2). Total num frames: 532807680. Throughput: 0: 3430.4. Samples: 133198890. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:49:23,136][15372] Avg episode reward: [(0, '44.051')] [2024-08-05 20:49:23,640][15444] Updated weights for policy 0, policy_version 65041 (0.0023) [2024-08-05 20:49:28,119][15372] Fps is (10 sec: 14745.6, 60 sec: 13926.5, 300 sec: 18799.9). Total num frames: 532865024. Throughput: 0: 3474.0. Samples: 133218850. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:49:28,126][15372] Avg episode reward: [(0, '43.583')] [2024-08-05 20:49:31,672][15444] Updated weights for policy 0, policy_version 65051 (0.0028) [2024-08-05 20:49:33,119][15372] Fps is (10 sec: 10659.5, 60 sec: 13380.3, 300 sec: 18550.0). Total num frames: 532914176. Throughput: 0: 3400.1. Samples: 133233780. 
Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:49:33,127][15372] Avg episode reward: [(0, '43.836')] [2024-08-05 20:49:37,130][15444] Updated weights for policy 0, policy_version 65061 (0.0021) [2024-08-05 20:49:38,121][15372] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 18438.9). Total num frames: 532996096. Throughput: 0: 3496.2. Samples: 133247750. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:49:38,121][15372] Avg episode reward: [(0, '43.206')] [2024-08-05 20:49:41,169][15444] Updated weights for policy 0, policy_version 65071 (0.0030) [2024-08-05 20:49:43,119][15372] Fps is (10 sec: 18022.3, 60 sec: 14336.0, 300 sec: 18383.4). Total num frames: 533094400. Throughput: 0: 3696.0. Samples: 133275230. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:49:43,127][15372] Avg episode reward: [(0, '43.872')] [2024-08-05 20:49:45,995][15444] Updated weights for policy 0, policy_version 65081 (0.0029) [2024-08-05 20:49:48,118][15372] Fps is (10 sec: 18022.6, 60 sec: 14746.0, 300 sec: 18216.8). Total num frames: 533176320. Throughput: 0: 3864.9. Samples: 133300890. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:49:48,119][15372] Avg episode reward: [(0, '44.685')] [2024-08-05 20:49:50,779][15444] Updated weights for policy 0, policy_version 65091 (0.0027) [2024-08-05 20:49:53,119][15372] Fps is (10 sec: 17203.7, 60 sec: 15291.7, 300 sec: 18105.7). Total num frames: 533266432. Throughput: 0: 3952.9. Samples: 133313950. Policy #0 lag: (min: 0.0, avg: 3.7, max: 8.0) [2024-08-05 20:49:53,119][15372] Avg episode reward: [(0, '45.096')] [2024-08-05 20:49:55,474][15444] Updated weights for policy 0, policy_version 65101 (0.0022) [2024-08-05 20:49:58,120][15372] Fps is (10 sec: 17201.0, 60 sec: 15564.5, 300 sec: 18022.3). Total num frames: 533348352. Throughput: 0: 4022.3. Samples: 133339770. 
Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 20:49:58,120][15372] Avg episode reward: [(0, '45.544')] [2024-08-05 20:49:59,984][15444] Updated weights for policy 0, policy_version 65111 (0.0019) [2024-08-05 20:50:03,118][15372] Fps is (10 sec: 18022.6, 60 sec: 16111.0, 300 sec: 18189.0). Total num frames: 533446656. Throughput: 0: 4115.6. Samples: 133369650. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 20:50:03,126][15372] Avg episode reward: [(0, '45.421')] [2024-08-05 20:50:03,898][15444] Updated weights for policy 0, policy_version 65121 (0.0033) [2024-08-05 20:50:06,406][15417] Signal inference workers to stop experience collection... (23700 times) [2024-08-05 20:50:06,411][15417] Signal inference workers to resume experience collection... (23700 times) [2024-08-05 20:50:06,499][15444] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-08-05 20:50:06,500][15444] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-08-05 20:50:08,118][15372] Fps is (10 sec: 18844.0, 60 sec: 16657.2, 300 sec: 18327.9). Total num frames: 533536768. Throughput: 0: 4102.4. Samples: 133383460. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 20:50:08,127][15372] Avg episode reward: [(0, '45.109')] [2024-08-05 20:50:08,857][15444] Updated weights for policy 0, policy_version 65131 (0.0016) [2024-08-05 20:50:12,724][15444] Updated weights for policy 0, policy_version 65141 (0.0025) [2024-08-05 20:50:13,119][15372] Fps is (10 sec: 18840.0, 60 sec: 16793.4, 300 sec: 18300.1). Total num frames: 533635072. Throughput: 0: 4241.3. Samples: 133409710. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 20:50:13,120][15372] Avg episode reward: [(0, '43.900')] [2024-08-05 20:50:17,363][15444] Updated weights for policy 0, policy_version 65151 (0.0012) [2024-08-05 20:50:18,119][15372] Fps is (10 sec: 19660.4, 60 sec: 16930.1, 300 sec: 18244.5). Total num frames: 533733376. Throughput: 0: 4580.9. 
Samples: 133439920. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 20:50:18,119][15372] Avg episode reward: [(0, '42.963')] [2024-08-05 20:50:18,122][15417] Saving /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000065153_533733376.pth... [2024-08-05 20:50:18,283][15417] Removing /home/evgenii/Documents/Jupyter_notebooks/SAMPLE_FACTORY/train_dir/doom_battle_w20_v20/checkpoint_p0/checkpoint_000064649_529604608.pth [2024-08-05 20:50:21,336][15444] Updated weights for policy 0, policy_version 65161 (0.0023) [2024-08-05 20:50:23,118][15372] Fps is (10 sec: 19662.4, 60 sec: 17069.4, 300 sec: 18189.0). Total num frames: 533831680. Throughput: 0: 4599.3. Samples: 133454720. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 20:50:23,126][15372] Avg episode reward: [(0, '42.843')] [2024-08-05 20:50:25,439][15444] Updated weights for policy 0, policy_version 65171 (0.0025) [2024-08-05 20:50:28,118][15372] Fps is (10 sec: 19661.2, 60 sec: 17749.4, 300 sec: 18105.7). Total num frames: 533929984. Throughput: 0: 4647.4. Samples: 133484360. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0) [2024-08-05 20:50:28,119][15372] Avg episode reward: [(0, '43.857')] [2024-08-05 20:50:29,558][15444] Updated weights for policy 0, policy_version 65181 (0.0013)