SergejSchweizer committed on
Commit
3911018
1 Parent(s): a904269

Upload . with huggingface_hub

.summary/0/events.out.tfevents.1677355992.683f0af01f9e ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3baa518fd0298cfd0336cacf63ca08954acf5583dbd4e5a66b6c807fab6e8306
+ size 451472
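
Note: the three added lines are a Git LFS pointer, not the file itself. The repository stores this small stub while the actual TensorBoard event file lives in LFS storage, addressed by its SHA-256 hash. A minimal sketch of reading such a pointer; the parse_lfs_pointer helper is illustrative and not part of huggingface_hub:

# Illustrative parser for the documented three-line Git LFS pointer format.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")   # "oid sha256:..." -> ("oid", "sha256:...")
        fields[key] = value
    return {
        "version": fields["version"],                     # LFS spec URL
        "sha256": fields["oid"].removeprefix("sha256:"),  # content hash
        "size": int(fields["size"]),                      # real file size in bytes
    }

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:3baa518fd0298cfd0336cacf63ca08954acf5583dbd4e5a66b6c807fab6e8306
size 451472"""
print(parse_lfs_pointer(pointer))  # {'version': ..., 'sha256': '3baa...', 'size': 451472}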
README.md CHANGED
@@ -15,7 +15,7 @@ model-index:
  type: doom_health_gathering_supreme
  metrics:
  - type: mean_reward
- value: 11.28 +/- 5.09
+ value: 12.24 +/- 6.12
  name: mean_reward
  verified: false
  ---
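
Note: the only model-card change is the reported metric, which improves from 11.28 +/- 5.09 to 12.24 +/- 6.12 — the mean and spread of per-episode true reward over the evaluation episodes (10 per the 'max_num_episodes'=10 line in sf_log.txt below). A sketch of how such a figure is produced; the reward values are made up, and whether population or sample standard deviation is used depends on the evaluation script:

import statistics

# Hypothetical per-episode true rewards from a 10-episode evaluation run.
episode_rewards = [11.3, 19.6, 21.0, 7.9, 5.2, 14.8, 9.1, 16.0, 8.3, 9.2]

mean = statistics.mean(episode_rewards)
std = statistics.pstdev(episode_rewards)  # population std; some scripts use stdev()
print(f"value: {mean:.2f} +/- {std:.2f}")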
checkpoint_p0/best_000001163_4763648_reward_31.256.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dd721f45f8bb0b7f2caabeb14a3547155f5779ee1df7fe989b823cb14f9b0b5d
+ size 34928806
checkpoint_p0/checkpoint_000001450_5939200.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:61fc5ff183832d3e087269c3d01f056de0faf06bdbd8bc916598140d191788e4
+ size 34929220
checkpoint_p0/checkpoint_000001466_6004736.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ebd75a5469c0f11ef0818ac5cc8f91a843341f51e4864210e83352c7d855fd2b
+ size 34929220
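
Note: all three .pth files are likewise LFS pointers. The file names encode the policy version and the cumulative environment step count (compare 'Loaded experiment state at self.train_step=978, self.env_steps=4005888' with checkpoint_000000978_4005888.pth in the log below); the best_* checkpoint additionally embeds the reward that made it the best policy, matching the 'Saving new best policy, reward=31.256!' log line. An illustrative decoder for this naming scheme; the describe helper is hypothetical:

import re

# Illustrative decoder for the checkpoint naming pattern seen in this commit.
BEST = re.compile(r"best_(\d+)_(\d+)_reward_([\d.]+)\.pth")
PLAIN = re.compile(r"checkpoint_(\d+)_(\d+)\.pth")

def describe(name: str) -> str:
    if m := BEST.match(name):
        return f"best policy: version {int(m[1])}, {int(m[2]):,} env steps, reward {m[3]}"
    if m := PLAIN.match(name):
        return f"checkpoint: version {int(m[1])}, {int(m[2]):,} env steps"
    return "unrecognized"

print(describe("best_000001163_4763648_reward_31.256.pth"))
# best policy: version 1163, 4,763,648 env steps, reward 31.256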
config.json CHANGED
@@ -65,7 +65,7 @@
  "summaries_use_frameskip": true,
  "heartbeat_interval": 20,
  "heartbeat_reporting_interval": 600,
- "train_for_env_steps": 4000000,
+ "train_for_env_steps": 6000000,
  "train_for_seconds": 10000000000,
  "save_every_sec": 120,
  "keep_checkpoints": 2,
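
Note: this is the only functional change to config.json — the environment-step budget is raised from 4,000,000 to 6,000,000 so the experiment can be resumed for roughly two million more steps (the training log below resumes at env_steps=4005888, and the newest checkpoint stops at 6004736). A simplified sketch of the resume behavior suggested by the 'Overriding arg ... passed from command line' lines in the log; this is not Sample Factory's actual code:

import json

# Simplified illustration: load the saved experiment config, apply CLI overrides.
def resume_config(config_path: str, cli_overrides: dict) -> dict:
    with open(config_path) as f:
        cfg = json.load(f)                 # e.g. cfg["train_for_env_steps"] == 4000000
    for key, value in cli_overrides.items():
        print(f"Overriding arg {key!r} with value {value!r} passed from command line")
        cfg[key] = value
    return cfg

# cfg = resume_config("train_dir/default_experiment/config.json",
#                     {"train_for_env_steps": 6_000_000})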
replay.mp4 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:89627cf83da114e93fa3515a199c58d7a31063043e89148d964c3c25e5ebbb60
- size 21961716
+ oid sha256:7acdf65873b53326d088cda54fdebad0d2490195a539435e2486cbb1e1bb3b07
+ size 24021580
sf_log.txt CHANGED
@@ -1349,3 +1349,1117 @@ main_loop: 1191.3321
  [2023-02-25 19:59:24,682][14226] Avg episode rewards: #0: 28.076, true rewards: #0: 11.276
  [2023-02-25 19:59:24,684][14226] Avg episode reward: 28.076, avg true_objective: 11.276
  [2023-02-25 20:00:37,511][14226] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
+ [2023-02-25 20:00:43,486][14226] The model has been pushed to https://huggingface.co/SergejSchweizer/rl_course_vizdoom_health_gathering_supreme
+ [2023-02-25 20:05:13,354][14226] Loading legacy config file train_dir/doom_health_gathering_supreme_2222/cfg.json instead of train_dir/doom_health_gathering_supreme_2222/config.json
+ [2023-02-25 20:05:13,356][14226] Loading existing experiment configuration from train_dir/doom_health_gathering_supreme_2222/config.json
+ [2023-02-25 20:05:13,360][14226] Overriding arg 'experiment' with value 'doom_health_gathering_supreme_2222' passed from command line
+ [2023-02-25 20:05:13,364][14226] Overriding arg 'train_dir' with value 'train_dir' passed from command line
+ [2023-02-25 20:05:13,366][14226] Overriding arg 'num_workers' with value 1 passed from command line
+ [2023-02-25 20:05:13,367][14226] Adding new argument 'lr_adaptive_min'=1e-06 that is not in the saved config file!
+ [2023-02-25 20:05:13,369][14226] Adding new argument 'lr_adaptive_max'=0.01 that is not in the saved config file!
+ [2023-02-25 20:05:13,370][14226] Adding new argument 'env_gpu_observations'=True that is not in the saved config file!
+ [2023-02-25 20:05:13,372][14226] Adding new argument 'no_render'=True that is not in the saved config file!
+ [2023-02-25 20:05:13,373][14226] Adding new argument 'save_video'=True that is not in the saved config file!
+ [2023-02-25 20:05:13,374][14226] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+ [2023-02-25 20:05:13,376][14226] Adding new argument 'video_name'=None that is not in the saved config file!
+ [2023-02-25 20:05:13,378][14226] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
+ [2023-02-25 20:05:13,380][14226] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+ [2023-02-25 20:05:13,382][14226] Adding new argument 'push_to_hub'=False that is not in the saved config file!
+ [2023-02-25 20:05:13,384][14226] Adding new argument 'hf_repository'=None that is not in the saved config file!
+ [2023-02-25 20:05:13,386][14226] Adding new argument 'policy_index'=0 that is not in the saved config file!
+ [2023-02-25 20:05:13,399][14226] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+ [2023-02-25 20:05:13,401][14226] Adding new argument 'train_script'=None that is not in the saved config file!
+ [2023-02-25 20:05:13,403][14226] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+ [2023-02-25 20:05:13,405][14226] Using frameskip 1 and render_action_repeat=4 for evaluation
+ [2023-02-25 20:05:13,428][14226] RunningMeanStd input shape: (3, 72, 128)
+ [2023-02-25 20:05:13,430][14226] RunningMeanStd input shape: (1,)
+ [2023-02-25 20:05:13,444][14226] ConvEncoder: input_channels=3
+ [2023-02-25 20:05:13,489][14226] Conv encoder output size: 512
+ [2023-02-25 20:05:13,491][14226] Policy head output size: 512
+ [2023-02-25 20:05:13,516][14226] Loading state from checkpoint train_dir/doom_health_gathering_supreme_2222/checkpoint_p0/checkpoint_000539850_4422451200.pth...
+ [2023-02-25 20:05:14,002][14226] Num frames 100...
+ [2023-02-25 20:05:14,120][14226] Num frames 200...
+ [2023-02-25 20:05:14,237][14226] Num frames 300...
+ [2023-02-25 20:05:14,360][14226] Num frames 400...
+ [2023-02-25 20:05:14,483][14226] Num frames 500...
+ [2023-02-25 20:05:14,610][14226] Num frames 600...
+ [2023-02-25 20:05:14,728][14226] Num frames 700...
+ [2023-02-25 20:05:14,848][14226] Num frames 800...
+ [2023-02-25 20:05:14,974][14226] Num frames 900...
+ [2023-02-25 20:05:15,099][14226] Num frames 1000...
+ [2023-02-25 20:05:15,218][14226] Num frames 1100...
+ [2023-02-25 20:05:15,341][14226] Num frames 1200...
+ [2023-02-25 20:05:15,470][14226] Num frames 1300...
+ [2023-02-25 20:05:15,598][14226] Num frames 1400...
+ [2023-02-25 20:05:15,717][14226] Num frames 1500...
+ [2023-02-25 20:05:15,838][14226] Num frames 1600...
+ [2023-02-25 20:05:15,962][14226] Num frames 1700...
+ [2023-02-25 20:05:16,088][14226] Num frames 1800...
+ [2023-02-25 20:05:16,205][14226] Num frames 1900...
+ [2023-02-25 20:05:16,330][14226] Avg episode rewards: #0: 62.599, true rewards: #0: 19.600
+ [2023-02-25 20:05:16,331][14226] Avg episode reward: 62.599, avg true_objective: 19.600
+ [2023-02-25 20:05:16,389][14226] Num frames 2000...
+ [2023-02-25 20:05:16,520][14226] Num frames 2100...
+ [2023-02-25 20:05:16,643][14226] Num frames 2200...
+ [2023-02-25 20:05:16,762][14226] Num frames 2300...
+ [2023-02-25 20:05:16,884][14226] Num frames 2400...
+ [2023-02-25 20:05:17,009][14226] Num frames 2500...
+ [2023-02-25 20:05:17,124][14226] Num frames 2600...
+ [2023-02-25 20:05:17,241][14226] Num frames 2700...
+ [2023-02-25 20:05:17,363][14226] Num frames 2800...
+ [2023-02-25 20:05:17,476][14226] Num frames 2900...
+ [2023-02-25 20:05:17,601][14226] Num frames 3000...
+ [2023-02-25 20:05:17,719][14226] Num frames 3100...
+ [2023-02-25 20:05:17,835][14226] Num frames 3200...
+ [2023-02-25 20:05:17,957][14226] Num frames 3300...
+ [2023-02-25 20:05:18,076][14226] Num frames 3400...
+ [2023-02-25 20:05:18,197][14226] Num frames 3500...
+ [2023-02-25 20:05:18,317][14226] Num frames 3600...
+ [2023-02-25 20:05:18,438][14226] Num frames 3700...
+ [2023-02-25 20:05:18,555][14226] Num frames 3800...
+ [2023-02-25 20:05:18,688][14226] Num frames 3900...
+ [2023-02-25 20:05:18,813][14226] Num frames 4000...
+ [2023-02-25 20:05:18,939][14226] Avg episode rewards: #0: 61.799, true rewards: #0: 20.300
+ [2023-02-25 20:05:18,941][14226] Avg episode reward: 61.799, avg true_objective: 20.300
+ [2023-02-25 20:05:18,994][14226] Num frames 4100...
+ [2023-02-25 20:05:19,107][14226] Num frames 4200...
+ [2023-02-25 20:05:19,223][14226] Num frames 4300...
+ [2023-02-25 20:05:19,336][14226] Num frames 4400...
+ [2023-02-25 20:05:19,468][14226] Num frames 4500...
+ [2023-02-25 20:05:19,593][14226] Num frames 4600...
+ [2023-02-25 20:05:19,717][14226] Num frames 4700...
+ [2023-02-25 20:05:19,841][14226] Num frames 4800...
+ [2023-02-25 20:05:19,956][14226] Num frames 4900...
+ [2023-02-25 20:05:20,078][14226] Num frames 5000...
+ [2023-02-25 20:05:20,193][14226] Num frames 5100...
+ [2023-02-25 20:05:20,310][14226] Num frames 5200...
+ [2023-02-25 20:05:20,439][14226] Num frames 5300...
+ [2023-02-25 20:05:20,559][14226] Num frames 5400...
+ [2023-02-25 20:05:20,686][14226] Num frames 5500...
+ [2023-02-25 20:05:20,803][14226] Num frames 5600...
+ [2023-02-25 20:05:20,926][14226] Num frames 5700...
+ [2023-02-25 20:05:21,043][14226] Num frames 5800...
+ [2023-02-25 20:05:21,160][14226] Num frames 5900...
+ [2023-02-25 20:05:21,276][14226] Num frames 6000...
+ [2023-02-25 20:05:21,397][14226] Num frames 6100...
+ [2023-02-25 20:05:21,521][14226] Avg episode rewards: #0: 59.865, true rewards: #0: 20.533
+ [2023-02-25 20:05:21,523][14226] Avg episode reward: 59.865, avg true_objective: 20.533
+ [2023-02-25 20:05:21,576][14226] Num frames 6200...
+ [2023-02-25 20:05:21,701][14226] Num frames 6300...
+ [2023-02-25 20:05:21,847][14226] Num frames 6400...
+ [2023-02-25 20:05:22,018][14226] Num frames 6500...
+ [2023-02-25 20:05:22,185][14226] Num frames 6600...
+ [2023-02-25 20:05:22,349][14226] Num frames 6700...
+ [2023-02-25 20:05:22,510][14226] Num frames 6800...
+ [2023-02-25 20:05:22,670][14226] Num frames 6900...
+ [2023-02-25 20:05:22,846][14226] Num frames 7000...
+ [2023-02-25 20:05:23,009][14226] Num frames 7100...
+ [2023-02-25 20:05:23,170][14226] Num frames 7200...
+ [2023-02-25 20:05:23,332][14226] Num frames 7300...
+ [2023-02-25 20:05:23,493][14226] Num frames 7400...
+ [2023-02-25 20:05:23,658][14226] Num frames 7500...
+ [2023-02-25 20:05:23,828][14226] Num frames 7600...
+ [2023-02-25 20:05:23,994][14226] Num frames 7700...
+ [2023-02-25 20:05:24,161][14226] Num frames 7800...
+ [2023-02-25 20:05:24,326][14226] Num frames 7900...
+ [2023-02-25 20:05:24,499][14226] Num frames 8000...
+ [2023-02-25 20:05:24,674][14226] Num frames 8100...
+ [2023-02-25 20:05:24,852][14226] Num frames 8200...
+ [2023-02-25 20:05:25,039][14226] Avg episode rewards: #0: 58.899, true rewards: #0: 20.650
+ [2023-02-25 20:05:25,042][14226] Avg episode reward: 58.899, avg true_objective: 20.650
+ [2023-02-25 20:05:25,124][14226] Num frames 8300...
+ [2023-02-25 20:05:25,300][14226] Num frames 8400...
+ [2023-02-25 20:05:25,468][14226] Num frames 8500...
+ [2023-02-25 20:05:25,594][14226] Num frames 8600...
+ [2023-02-25 20:05:25,707][14226] Num frames 8700...
+ [2023-02-25 20:05:25,830][14226] Num frames 8800...
+ [2023-02-25 20:05:25,947][14226] Num frames 8900...
+ [2023-02-25 20:05:26,059][14226] Num frames 9000...
+ [2023-02-25 20:05:26,176][14226] Num frames 9100...
+ [2023-02-25 20:05:26,298][14226] Num frames 9200...
+ [2023-02-25 20:05:26,412][14226] Num frames 9300...
+ [2023-02-25 20:05:26,526][14226] Num frames 9400...
+ [2023-02-25 20:05:26,649][14226] Num frames 9500...
+ [2023-02-25 20:05:26,769][14226] Num frames 9600...
+ [2023-02-25 20:05:26,895][14226] Num frames 9700...
+ [2023-02-25 20:05:27,018][14226] Num frames 9800...
+ [2023-02-25 20:05:27,134][14226] Num frames 9900...
+ [2023-02-25 20:05:27,256][14226] Num frames 10000...
+ [2023-02-25 20:05:27,373][14226] Num frames 10100...
+ [2023-02-25 20:05:27,490][14226] Num frames 10200...
+ [2023-02-25 20:05:27,614][14226] Num frames 10300...
+ [2023-02-25 20:05:27,741][14226] Avg episode rewards: #0: 58.919, true rewards: #0: 20.720
+ [2023-02-25 20:05:27,744][14226] Avg episode reward: 58.919, avg true_objective: 20.720
+ [2023-02-25 20:05:27,800][14226] Num frames 10400...
+ [2023-02-25 20:05:27,937][14226] Num frames 10500...
+ [2023-02-25 20:05:28,052][14226] Num frames 10600...
+ [2023-02-25 20:05:28,168][14226] Num frames 10700...
+ [2023-02-25 20:05:28,287][14226] Num frames 10800...
+ [2023-02-25 20:05:28,406][14226] Num frames 10900...
+ [2023-02-25 20:05:28,524][14226] Num frames 11000...
+ [2023-02-25 20:05:28,642][14226] Num frames 11100...
+ [2023-02-25 20:05:28,765][14226] Num frames 11200...
+ [2023-02-25 20:05:28,888][14226] Num frames 11300...
+ [2023-02-25 20:05:29,006][14226] Num frames 11400...
+ [2023-02-25 20:05:29,128][14226] Num frames 11500...
+ [2023-02-25 20:05:29,252][14226] Num frames 11600...
+ [2023-02-25 20:05:29,372][14226] Num frames 11700...
+ [2023-02-25 20:05:29,494][14226] Num frames 11800...
+ [2023-02-25 20:05:29,617][14226] Num frames 11900...
+ [2023-02-25 20:05:29,736][14226] Num frames 12000...
+ [2023-02-25 20:05:29,862][14226] Num frames 12100...
+ [2023-02-25 20:05:29,999][14226] Num frames 12200...
+ [2023-02-25 20:05:30,126][14226] Num frames 12300...
+ [2023-02-25 20:05:30,247][14226] Num frames 12400...
+ [2023-02-25 20:05:30,373][14226] Avg episode rewards: #0: 59.265, true rewards: #0: 20.767
+ [2023-02-25 20:05:30,380][14226] Avg episode reward: 59.265, avg true_objective: 20.767
+ [2023-02-25 20:05:30,435][14226] Num frames 12500...
+ [2023-02-25 20:05:30,559][14226] Num frames 12600...
+ [2023-02-25 20:05:30,676][14226] Num frames 12700...
+ [2023-02-25 20:05:30,791][14226] Num frames 12800...
+ [2023-02-25 20:05:30,921][14226] Num frames 12900...
+ [2023-02-25 20:05:31,035][14226] Num frames 13000...
+ [2023-02-25 20:05:31,159][14226] Num frames 13100...
+ [2023-02-25 20:05:31,272][14226] Num frames 13200...
+ [2023-02-25 20:05:31,387][14226] Num frames 13300...
+ [2023-02-25 20:05:31,510][14226] Num frames 13400...
+ [2023-02-25 20:05:31,628][14226] Num frames 13500...
+ [2023-02-25 20:05:31,753][14226] Num frames 13600...
+ [2023-02-25 20:05:31,880][14226] Num frames 13700...
+ [2023-02-25 20:05:32,004][14226] Num frames 13800...
+ [2023-02-25 20:05:32,127][14226] Num frames 13900...
+ [2023-02-25 20:05:32,248][14226] Num frames 14000...
+ [2023-02-25 20:05:32,368][14226] Num frames 14100...
+ [2023-02-25 20:05:32,494][14226] Num frames 14200...
+ [2023-02-25 20:05:32,616][14226] Num frames 14300...
+ [2023-02-25 20:05:32,732][14226] Num frames 14400...
+ [2023-02-25 20:05:32,851][14226] Num frames 14500...
+ [2023-02-25 20:05:32,976][14226] Avg episode rewards: #0: 60.799, true rewards: #0: 20.800
+ [2023-02-25 20:05:32,979][14226] Avg episode reward: 60.799, avg true_objective: 20.800
+ [2023-02-25 20:05:33,032][14226] Num frames 14600...
+ [2023-02-25 20:05:33,167][14226] Num frames 14700...
+ [2023-02-25 20:05:33,295][14226] Num frames 14800...
+ [2023-02-25 20:05:33,419][14226] Num frames 14900...
+ [2023-02-25 20:05:33,554][14226] Num frames 15000...
+ [2023-02-25 20:05:33,676][14226] Num frames 15100...
+ [2023-02-25 20:05:33,791][14226] Num frames 15200...
+ [2023-02-25 20:05:33,907][14226] Num frames 15300...
+ [2023-02-25 20:05:34,040][14226] Num frames 15400...
+ [2023-02-25 20:05:34,162][14226] Num frames 15500...
+ [2023-02-25 20:05:34,280][14226] Num frames 15600...
+ [2023-02-25 20:05:34,405][14226] Num frames 15700...
+ [2023-02-25 20:05:34,525][14226] Num frames 15800...
+ [2023-02-25 20:05:34,645][14226] Num frames 15900...
+ [2023-02-25 20:05:34,772][14226] Num frames 16000...
+ [2023-02-25 20:05:34,892][14226] Num frames 16100...
+ [2023-02-25 20:05:35,014][14226] Num frames 16200...
+ [2023-02-25 20:05:35,132][14226] Num frames 16300...
+ [2023-02-25 20:05:35,249][14226] Num frames 16400...
+ [2023-02-25 20:05:35,375][14226] Num frames 16500...
+ [2023-02-25 20:05:35,495][14226] Num frames 16600...
+ [2023-02-25 20:05:35,658][14226] Avg episode rewards: #0: 61.199, true rewards: #0: 20.825
+ [2023-02-25 20:05:35,661][14226] Avg episode reward: 61.199, avg true_objective: 20.825
+ [2023-02-25 20:05:35,735][14226] Num frames 16700...
+ [2023-02-25 20:05:35,902][14226] Num frames 16800...
+ [2023-02-25 20:05:36,072][14226] Num frames 16900...
+ [2023-02-25 20:05:36,236][14226] Num frames 17000...
+ [2023-02-25 20:05:36,397][14226] Num frames 17100...
+ [2023-02-25 20:05:36,571][14226] Num frames 17200...
+ [2023-02-25 20:05:36,741][14226] Num frames 17300...
+ [2023-02-25 20:05:36,913][14226] Num frames 17400...
+ [2023-02-25 20:05:37,081][14226] Num frames 17500...
+ [2023-02-25 20:05:37,241][14226] Num frames 17600...
+ [2023-02-25 20:05:37,416][14226] Num frames 17700...
+ [2023-02-25 20:05:37,578][14226] Num frames 17800...
+ [2023-02-25 20:05:37,738][14226] Num frames 17900...
+ [2023-02-25 20:05:37,912][14226] Num frames 18000...
+ [2023-02-25 20:05:38,084][14226] Num frames 18100...
+ [2023-02-25 20:05:38,255][14226] Num frames 18200...
+ [2023-02-25 20:05:38,426][14226] Num frames 18300...
+ [2023-02-25 20:05:38,601][14226] Num frames 18400...
+ [2023-02-25 20:05:38,768][14226] Num frames 18500...
+ [2023-02-25 20:05:38,931][14226] Num frames 18600...
+ [2023-02-25 20:05:39,099][14226] Num frames 18700...
+ [2023-02-25 20:05:39,231][14226] Avg episode rewards: #0: 61.510, true rewards: #0: 20.844
+ [2023-02-25 20:05:39,233][14226] Avg episode reward: 61.510, avg true_objective: 20.844
+ [2023-02-25 20:05:39,289][14226] Num frames 18800...
+ [2023-02-25 20:05:39,412][14226] Num frames 18900...
+ [2023-02-25 20:05:39,546][14226] Num frames 19000...
+ [2023-02-25 20:05:39,675][14226] Num frames 19100...
+ [2023-02-25 20:05:39,796][14226] Num frames 19200...
+ [2023-02-25 20:05:39,926][14226] Num frames 19300...
+ [2023-02-25 20:05:40,045][14226] Num frames 19400...
+ [2023-02-25 20:05:40,178][14226] Num frames 19500...
+ [2023-02-25 20:05:40,295][14226] Num frames 19600...
+ [2023-02-25 20:05:40,420][14226] Num frames 19700...
+ [2023-02-25 20:05:40,565][14226] Num frames 19800...
+ [2023-02-25 20:05:40,686][14226] Num frames 19900...
+ [2023-02-25 20:05:40,814][14226] Num frames 20000...
+ [2023-02-25 20:05:40,932][14226] Num frames 20100...
+ [2023-02-25 20:05:41,057][14226] Num frames 20200...
+ [2023-02-25 20:05:41,189][14226] Num frames 20300...
+ [2023-02-25 20:05:41,309][14226] Num frames 20400...
+ [2023-02-25 20:05:41,433][14226] Num frames 20500...
+ [2023-02-25 20:05:41,552][14226] Num frames 20600...
+ [2023-02-25 20:05:41,675][14226] Num frames 20700...
+ [2023-02-25 20:05:41,799][14226] Num frames 20800...
+ [2023-02-25 20:05:41,925][14226] Avg episode rewards: #0: 62.359, true rewards: #0: 20.860
+ [2023-02-25 20:05:41,926][14226] Avg episode reward: 62.359, avg true_objective: 20.860
+ [2023-02-25 20:07:59,930][14226] Replay video saved to train_dir/doom_health_gathering_supreme_2222/replay.mp4!
+ [2023-02-25 20:13:16,207][36780] Saving configuration to /content/train_dir/default_experiment/config.json...
+ [2023-02-25 20:13:16,213][36780] Rollout worker 0 uses device cpu
+ [2023-02-25 20:13:16,216][36780] Rollout worker 1 uses device cpu
+ [2023-02-25 20:13:16,218][36780] Rollout worker 2 uses device cpu
+ [2023-02-25 20:13:16,219][36780] Rollout worker 3 uses device cpu
+ [2023-02-25 20:13:16,221][36780] Rollout worker 4 uses device cpu
+ [2023-02-25 20:13:16,223][36780] Rollout worker 5 uses device cpu
+ [2023-02-25 20:13:16,225][36780] Rollout worker 6 uses device cpu
+ [2023-02-25 20:13:16,228][36780] Rollout worker 7 uses device cpu
+ [2023-02-25 20:13:16,416][36780] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+ [2023-02-25 20:13:16,422][36780] InferenceWorker_p0-w0: min num requests: 2
+ [2023-02-25 20:13:16,467][36780] Starting all processes...
+ [2023-02-25 20:13:16,476][36780] Starting process learner_proc0
+ [2023-02-25 20:13:16,554][36780] Starting all processes...
+ [2023-02-25 20:13:16,569][36780] Starting process inference_proc0-0
+ [2023-02-25 20:13:16,570][36780] Starting process rollout_proc0
+ [2023-02-25 20:13:16,572][36780] Starting process rollout_proc1
+ [2023-02-25 20:13:16,572][36780] Starting process rollout_proc2
+ [2023-02-25 20:13:16,573][36780] Starting process rollout_proc3
+ [2023-02-25 20:13:16,573][36780] Starting process rollout_proc4
+ [2023-02-25 20:13:16,573][36780] Starting process rollout_proc5
+ [2023-02-25 20:13:16,573][36780] Starting process rollout_proc6
+ [2023-02-25 20:13:16,573][36780] Starting process rollout_proc7
+ [2023-02-25 20:13:28,474][36994] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+ [2023-02-25 20:13:28,475][36994] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
+ [2023-02-25 20:13:28,780][37011] Worker 2 uses CPU cores [0]
+ [2023-02-25 20:13:28,830][37015] Worker 6 uses CPU cores [0]
+ [2023-02-25 20:13:29,012][37007] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+ [2023-02-25 20:13:29,012][37007] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
+ [2023-02-25 20:13:29,026][37009] Worker 1 uses CPU cores [1]
+ [2023-02-25 20:13:29,126][37014] Worker 5 uses CPU cores [1]
+ [2023-02-25 20:13:29,214][37013] Worker 4 uses CPU cores [0]
+ [2023-02-25 20:13:29,294][37012] Worker 3 uses CPU cores [1]
+ [2023-02-25 20:13:29,303][37016] Worker 7 uses CPU cores [1]
+ [2023-02-25 20:13:29,451][37010] Worker 0 uses CPU cores [0]
+ [2023-02-25 20:13:29,567][37007] Num visible devices: 1
+ [2023-02-25 20:13:29,580][36994] Num visible devices: 1
+ [2023-02-25 20:13:29,629][36994] Starting seed is not provided
+ [2023-02-25 20:13:29,630][36994] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+ [2023-02-25 20:13:29,631][36994] Initializing actor-critic model on device cuda:0
+ [2023-02-25 20:13:29,633][36994] RunningMeanStd input shape: (3, 72, 128)
+ [2023-02-25 20:13:29,635][36994] RunningMeanStd input shape: (1,)
+ [2023-02-25 20:13:29,734][36994] ConvEncoder: input_channels=3
+ [2023-02-25 20:13:30,009][36994] Conv encoder output size: 512
+ [2023-02-25 20:13:30,010][36994] Policy head output size: 512
+ [2023-02-25 20:13:30,051][36994] Created Actor Critic model with architecture:
+ [2023-02-25 20:13:30,052][36994] ActorCriticSharedWeights(
+ (obs_normalizer): ObservationNormalizer(
+ (running_mean_std): RunningMeanStdDictInPlace(
+ (running_mean_std): ModuleDict(
+ (obs): RunningMeanStdInPlace()
+ )
+ )
+ )
+ (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
+ (encoder): VizdoomEncoder(
+ (basic_encoder): ConvEncoder(
+ (enc): RecursiveScriptModule(
+ original_name=ConvEncoderImpl
+ (conv_head): RecursiveScriptModule(
+ original_name=Sequential
+ (0): RecursiveScriptModule(original_name=Conv2d)
+ (1): RecursiveScriptModule(original_name=ELU)
+ (2): RecursiveScriptModule(original_name=Conv2d)
+ (3): RecursiveScriptModule(original_name=ELU)
+ (4): RecursiveScriptModule(original_name=Conv2d)
+ (5): RecursiveScriptModule(original_name=ELU)
+ )
+ (mlp_layers): RecursiveScriptModule(
+ original_name=Sequential
+ (0): RecursiveScriptModule(original_name=Linear)
+ (1): RecursiveScriptModule(original_name=ELU)
+ )
+ )
+ )
+ )
+ (core): ModelCoreRNN(
+ (core): GRU(512, 512)
+ )
+ (decoder): MlpDecoder(
+ (mlp): Identity()
+ )
+ (critic_linear): Linear(in_features=512, out_features=1, bias=True)
+ (action_parameterization): ActionParameterizationDefault(
+ (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
+ )
+ )
+ [2023-02-25 20:13:36,392][36994] Using optimizer <class 'torch.optim.adam.Adam'>
+ [2023-02-25 20:13:36,394][36994] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
+ [2023-02-25 20:13:36,406][36780] Heartbeat connected on Batcher_0
+ [2023-02-25 20:13:36,417][36780] Heartbeat connected on InferenceWorker_p0-w0
+ [2023-02-25 20:13:36,434][36780] Heartbeat connected on RolloutWorker_w0
+ [2023-02-25 20:13:36,439][36780] Heartbeat connected on RolloutWorker_w1
+ [2023-02-25 20:13:36,443][36780] Heartbeat connected on RolloutWorker_w2
+ [2023-02-25 20:13:36,448][36780] Heartbeat connected on RolloutWorker_w3
+ [2023-02-25 20:13:36,456][36780] Heartbeat connected on RolloutWorker_w4
+ [2023-02-25 20:13:36,460][36780] Heartbeat connected on RolloutWorker_w5
+ [2023-02-25 20:13:36,470][36994] Loading model from checkpoint
+ [2023-02-25 20:13:36,471][36780] Heartbeat connected on RolloutWorker_w6
+ [2023-02-25 20:13:36,480][36780] Heartbeat connected on RolloutWorker_w7
+ [2023-02-25 20:13:36,481][36994] Loaded experiment state at self.train_step=978, self.env_steps=4005888
+ [2023-02-25 20:13:36,485][36994] Initialized policy 0 weights for model version 978
+ [2023-02-25 20:13:36,492][36994] LearnerWorker_p0 finished initialization!
+ [2023-02-25 20:13:36,493][36994] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+ [2023-02-25 20:13:36,506][36780] Heartbeat connected on LearnerWorker_p0
+ [2023-02-25 20:13:36,677][37007] RunningMeanStd input shape: (3, 72, 128)
+ [2023-02-25 20:13:36,678][37007] RunningMeanStd input shape: (1,)
+ [2023-02-25 20:13:36,690][37007] ConvEncoder: input_channels=3
+ [2023-02-25 20:13:36,795][37007] Conv encoder output size: 512
+ [2023-02-25 20:13:36,795][37007] Policy head output size: 512
+ [2023-02-25 20:13:37,642][36780] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4005888. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+ [2023-02-25 20:13:39,040][36780] Inference worker 0-0 is ready!
+ [2023-02-25 20:13:39,042][36780] All inference workers are ready! Signal rollout workers to start!
+ [2023-02-25 20:13:39,168][37012] Doom resolution: 160x120, resize resolution: (128, 72)
+ [2023-02-25 20:13:39,173][37014] Doom resolution: 160x120, resize resolution: (128, 72)
+ [2023-02-25 20:13:39,178][37009] Doom resolution: 160x120, resize resolution: (128, 72)
+ [2023-02-25 20:13:39,184][37016] Doom resolution: 160x120, resize resolution: (128, 72)
+ [2023-02-25 20:13:39,218][37015] Doom resolution: 160x120, resize resolution: (128, 72)
+ [2023-02-25 20:13:39,219][37010] Doom resolution: 160x120, resize resolution: (128, 72)
+ [2023-02-25 20:13:39,220][37011] Doom resolution: 160x120, resize resolution: (128, 72)
+ [2023-02-25 20:13:39,217][37013] Doom resolution: 160x120, resize resolution: (128, 72)
+ [2023-02-25 20:13:40,028][37013] Decorrelating experience for 0 frames...
+ [2023-02-25 20:13:40,033][37011] Decorrelating experience for 0 frames...
+ [2023-02-25 20:13:40,336][37014] Decorrelating experience for 0 frames...
+ [2023-02-25 20:13:40,342][37016] Decorrelating experience for 0 frames...
+ [2023-02-25 20:13:40,347][37012] Decorrelating experience for 0 frames...
+ [2023-02-25 20:13:41,223][37013] Decorrelating experience for 32 frames...
+ [2023-02-25 20:13:41,371][37016] Decorrelating experience for 32 frames...
+ [2023-02-25 20:13:41,374][37014] Decorrelating experience for 32 frames...
+ [2023-02-25 20:13:41,377][37012] Decorrelating experience for 32 frames...
+ [2023-02-25 20:13:41,382][37015] Decorrelating experience for 0 frames...
+ [2023-02-25 20:13:41,389][37010] Decorrelating experience for 0 frames...
+ [2023-02-25 20:13:41,692][37011] Decorrelating experience for 32 frames...
+ [2023-02-25 20:13:42,477][37015] Decorrelating experience for 32 frames...
+ [2023-02-25 20:13:42,480][37010] Decorrelating experience for 32 frames...
+ [2023-02-25 20:13:42,621][37016] Decorrelating experience for 64 frames...
+ [2023-02-25 20:13:42,642][36780] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+ [2023-02-25 20:13:42,645][37014] Decorrelating experience for 64 frames...
+ [2023-02-25 20:13:42,904][37009] Decorrelating experience for 0 frames...
+ [2023-02-25 20:13:42,993][37011] Decorrelating experience for 64 frames...
+ [2023-02-25 20:13:43,820][37012] Decorrelating experience for 64 frames...
+ [2023-02-25 20:13:43,843][37013] Decorrelating experience for 64 frames...
+ [2023-02-25 20:13:43,909][37016] Decorrelating experience for 96 frames...
+ [2023-02-25 20:13:43,936][37014] Decorrelating experience for 96 frames...
+ [2023-02-25 20:13:44,038][37015] Decorrelating experience for 64 frames...
+ [2023-02-25 20:13:44,387][37010] Decorrelating experience for 64 frames...
+ [2023-02-25 20:13:44,603][37011] Decorrelating experience for 96 frames...
+ [2023-02-25 20:13:45,201][37009] Decorrelating experience for 32 frames...
+ [2023-02-25 20:13:45,587][37012] Decorrelating experience for 96 frames...
+ [2023-02-25 20:13:46,209][37013] Decorrelating experience for 96 frames...
+ [2023-02-25 20:13:46,376][37015] Decorrelating experience for 96 frames...
+ [2023-02-25 20:13:47,070][37009] Decorrelating experience for 64 frames...
+ [2023-02-25 20:13:47,162][37010] Decorrelating experience for 96 frames...
+ [2023-02-25 20:13:47,642][36780] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 2.0. Samples: 20. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+ [2023-02-25 20:13:47,645][36780] Avg episode reward: [(0, '0.800')]
+ [2023-02-25 20:13:51,284][37009] Decorrelating experience for 96 frames...
+ [2023-02-25 20:13:51,868][36994] Signal inference workers to stop experience collection...
+ [2023-02-25 20:13:51,877][37007] InferenceWorker_p0-w0: stopping experience collection
+ [2023-02-25 20:13:52,642][36780] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 145.7. Samples: 2186. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+ [2023-02-25 20:13:52,648][36780] Avg episode reward: [(0, '2.498')]
+ [2023-02-25 20:13:53,636][36994] Signal inference workers to resume experience collection...
+ [2023-02-25 20:13:53,637][37007] InferenceWorker_p0-w0: resuming experience collection
+ [2023-02-25 20:13:57,642][36780] Fps is (10 sec: 1638.4, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 4022272. Throughput: 0: 229.3. Samples: 4586. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0)
+ [2023-02-25 20:13:57,654][36780] Avg episode reward: [(0, '4.360')]
+ [2023-02-25 20:14:02,642][36780] Fps is (10 sec: 3686.4, 60 sec: 1474.6, 300 sec: 1474.6). Total num frames: 4042752. Throughput: 0: 310.0. Samples: 7750. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:14:02,644][36780] Avg episode reward: [(0, '10.463')]
+ [2023-02-25 20:14:02,929][37007] Updated weights for policy 0, policy_version 988 (0.0012)
+ [2023-02-25 20:14:07,642][36780] Fps is (10 sec: 4096.0, 60 sec: 1911.5, 300 sec: 1911.5). Total num frames: 4063232. Throughput: 0: 466.8. Samples: 14004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:14:07,647][36780] Avg episode reward: [(0, '17.300')]
+ [2023-02-25 20:14:12,642][36780] Fps is (10 sec: 3276.7, 60 sec: 1989.5, 300 sec: 1989.5). Total num frames: 4075520. Throughput: 0: 520.1. Samples: 18202. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:14:12,647][36780] Avg episode reward: [(0, '20.322')]
+ [2023-02-25 20:14:16,173][37007] Updated weights for policy 0, policy_version 998 (0.0016)
+ [2023-02-25 20:14:17,642][36780] Fps is (10 sec: 2867.2, 60 sec: 2150.4, 300 sec: 2150.4). Total num frames: 4091904. Throughput: 0: 505.2. Samples: 20206. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:14:17,645][36780] Avg episode reward: [(0, '20.621')]
+ [2023-02-25 20:14:22,642][36780] Fps is (10 sec: 3686.5, 60 sec: 2366.6, 300 sec: 2366.6). Total num frames: 4112384. Throughput: 0: 579.5. Samples: 26076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:14:22,645][36780] Avg episode reward: [(0, '21.303')]
+ [2023-02-25 20:14:25,898][37007] Updated weights for policy 0, policy_version 1008 (0.0020)
+ [2023-02-25 20:14:27,643][36780] Fps is (10 sec: 4095.4, 60 sec: 2539.4, 300 sec: 2539.4). Total num frames: 4132864. Throughput: 0: 714.9. Samples: 32172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:14:27,648][36780] Avg episode reward: [(0, '24.253')]
+ [2023-02-25 20:14:32,642][36780] Fps is (10 sec: 3276.8, 60 sec: 2532.1, 300 sec: 2532.1). Total num frames: 4145152. Throughput: 0: 759.6. Samples: 34202. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+ [2023-02-25 20:14:32,647][36780] Avg episode reward: [(0, '24.321')]
+ [2023-02-25 20:14:37,642][36780] Fps is (10 sec: 2867.7, 60 sec: 2594.1, 300 sec: 2594.1). Total num frames: 4161536. Throughput: 0: 802.0. Samples: 38276. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+ [2023-02-25 20:14:37,645][36780] Avg episode reward: [(0, '24.009')]
+ [2023-02-25 20:14:39,627][37007] Updated weights for policy 0, policy_version 1018 (0.0031)
+ [2023-02-25 20:14:42,642][36780] Fps is (10 sec: 3686.4, 60 sec: 2935.5, 300 sec: 2709.7). Total num frames: 4182016. Throughput: 0: 884.2. Samples: 44374. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+ [2023-02-25 20:14:42,645][36780] Avg episode reward: [(0, '23.108')]
+ [2023-02-25 20:14:47,644][36780] Fps is (10 sec: 4095.3, 60 sec: 3276.7, 300 sec: 2808.6). Total num frames: 4202496. Throughput: 0: 885.2. Samples: 47584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:14:47,648][36780] Avg episode reward: [(0, '23.678')]
+ [2023-02-25 20:14:50,119][37007] Updated weights for policy 0, policy_version 1028 (0.0016)
+ [2023-02-25 20:14:52,642][36780] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 2785.3). Total num frames: 4214784. Throughput: 0: 844.8. Samples: 52018. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+ [2023-02-25 20:14:52,648][36780] Avg episode reward: [(0, '23.693')]
+ [2023-02-25 20:14:57,642][36780] Fps is (10 sec: 2458.0, 60 sec: 3413.3, 300 sec: 2764.8). Total num frames: 4227072. Throughput: 0: 837.5. Samples: 55888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+ [2023-02-25 20:14:57,645][36780] Avg episode reward: [(0, '23.846')]
+ [2023-02-25 20:15:02,642][36780] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 2843.1). Total num frames: 4247552. Throughput: 0: 860.8. Samples: 58940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:15:02,652][36780] Avg episode reward: [(0, '24.171')]
+ [2023-02-25 20:15:02,950][37007] Updated weights for policy 0, policy_version 1038 (0.0020)
+ [2023-02-25 20:15:07,642][36780] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 2958.2). Total num frames: 4272128. Throughput: 0: 875.7. Samples: 65482. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:15:07,648][36780] Avg episode reward: [(0, '24.212')]
+ [2023-02-25 20:15:12,645][36780] Fps is (10 sec: 3685.4, 60 sec: 3481.5, 300 sec: 2931.8). Total num frames: 4284416. Throughput: 0: 844.1. Samples: 70156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:15:12,648][36780] Avg episode reward: [(0, '24.322')]
+ [2023-02-25 20:15:12,664][36994] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001046_4284416.pth...
+ [2023-02-25 20:15:12,803][36994] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000880_3604480.pth
+ [2023-02-25 20:15:15,046][37007] Updated weights for policy 0, policy_version 1048 (0.0012)
+ [2023-02-25 20:15:17,642][36780] Fps is (10 sec: 2457.5, 60 sec: 3413.3, 300 sec: 2908.2). Total num frames: 4296704. Throughput: 0: 843.1. Samples: 72142. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+ [2023-02-25 20:15:17,644][36780] Avg episode reward: [(0, '22.939')]
+ [2023-02-25 20:15:22,642][36780] Fps is (10 sec: 3277.7, 60 sec: 3413.3, 300 sec: 2964.7). Total num frames: 4317184. Throughput: 0: 870.6. Samples: 77454. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+ [2023-02-25 20:15:22,645][36780] Avg episode reward: [(0, '23.019')]
+ [2023-02-25 20:15:25,549][37007] Updated weights for policy 0, policy_version 1058 (0.0020)
+ [2023-02-25 20:15:27,642][36780] Fps is (10 sec: 4505.7, 60 sec: 3481.7, 300 sec: 3053.4). Total num frames: 4341760. Throughput: 0: 880.8. Samples: 84008. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+ [2023-02-25 20:15:27,645][36780] Avg episode reward: [(0, '24.488')]
+ [2023-02-25 20:15:32,645][36780] Fps is (10 sec: 3685.1, 60 sec: 3481.4, 300 sec: 3027.4). Total num frames: 4354048. Throughput: 0: 862.5. Samples: 86396. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+ [2023-02-25 20:15:32,648][36780] Avg episode reward: [(0, '25.865')]
+ [2023-02-25 20:15:37,642][36780] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 3003.7). Total num frames: 4366336. Throughput: 0: 854.3. Samples: 90460. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+ [2023-02-25 20:15:37,647][36780] Avg episode reward: [(0, '25.818')]
+ [2023-02-25 20:15:39,190][37007] Updated weights for policy 0, policy_version 1068 (0.0016)
+ [2023-02-25 20:15:42,642][36780] Fps is (10 sec: 3278.0, 60 sec: 3413.3, 300 sec: 3047.4). Total num frames: 4386816. Throughput: 0: 887.8. Samples: 95838. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:15:42,645][36780] Avg episode reward: [(0, '25.888')]
+ [2023-02-25 20:15:47,642][36780] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3087.8). Total num frames: 4407296. Throughput: 0: 890.6. Samples: 99016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:15:47,651][36780] Avg episode reward: [(0, '26.303')]
+ [2023-02-25 20:15:48,672][37007] Updated weights for policy 0, policy_version 1078 (0.0014)
+ [2023-02-25 20:15:52,644][36780] Fps is (10 sec: 3685.8, 60 sec: 3481.5, 300 sec: 3094.7). Total num frames: 4423680. Throughput: 0: 869.7. Samples: 104620. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:15:52,650][36780] Avg episode reward: [(0, '26.379')]
+ [2023-02-25 20:15:57,643][36780] Fps is (10 sec: 2866.9, 60 sec: 3481.5, 300 sec: 3072.0). Total num frames: 4435968. Throughput: 0: 855.7. Samples: 108662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:15:57,649][36780] Avg episode reward: [(0, '24.535')]
+ [2023-02-25 20:16:02,270][37007] Updated weights for policy 0, policy_version 1088 (0.0016)
+ [2023-02-25 20:16:02,642][36780] Fps is (10 sec: 3277.3, 60 sec: 3481.6, 300 sec: 3107.3). Total num frames: 4456448. Throughput: 0: 859.5. Samples: 110818. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:16:02,645][36780] Avg episode reward: [(0, '24.583')]
+ [2023-02-25 20:16:07,642][36780] Fps is (10 sec: 4096.4, 60 sec: 3413.3, 300 sec: 3140.3). Total num frames: 4476928. Throughput: 0: 889.4. Samples: 117476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:16:07,645][36780] Avg episode reward: [(0, '23.040')]
+ [2023-02-25 20:16:12,477][37007] Updated weights for policy 0, policy_version 1098 (0.0025)
+ [2023-02-25 20:16:12,642][36780] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3171.1). Total num frames: 4497408. Throughput: 0: 871.1. Samples: 123208. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:16:12,645][36780] Avg episode reward: [(0, '22.578')]
+ [2023-02-25 20:16:17,642][36780] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3148.8). Total num frames: 4509696. Throughput: 0: 864.1. Samples: 125278. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:16:17,647][36780] Avg episode reward: [(0, '23.180')]
+ [2023-02-25 20:16:22,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3152.7). Total num frames: 4526080. Throughput: 0: 871.0. Samples: 129656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:16:22,649][36780] Avg episode reward: [(0, '24.289')]
+ [2023-02-25 20:16:24,708][37007] Updated weights for policy 0, policy_version 1108 (0.0025)
+ [2023-02-25 20:16:27,642][36780] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3204.5). Total num frames: 4550656. Throughput: 0: 896.8. Samples: 136194. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+ [2023-02-25 20:16:27,644][36780] Avg episode reward: [(0, '25.080')]
+ [2023-02-25 20:16:32,642][36780] Fps is (10 sec: 4096.0, 60 sec: 3550.1, 300 sec: 3206.6). Total num frames: 4567040. Throughput: 0: 897.8. Samples: 139416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:16:32,648][36780] Avg episode reward: [(0, '24.626')]
+ [2023-02-25 20:16:37,616][37007] Updated weights for policy 0, policy_version 1118 (0.0039)
+ [2023-02-25 20:16:37,644][36780] Fps is (10 sec: 2866.5, 60 sec: 3549.7, 300 sec: 3185.7). Total num frames: 4579328. Throughput: 0: 853.1. Samples: 143010. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:16:37,647][36780] Avg episode reward: [(0, '24.583')]
+ [2023-02-25 20:16:42,642][36780] Fps is (10 sec: 2048.0, 60 sec: 3345.1, 300 sec: 3144.0). Total num frames: 4587520. Throughput: 0: 836.7. Samples: 146312. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+ [2023-02-25 20:16:42,652][36780] Avg episode reward: [(0, '24.644')]
+ [2023-02-25 20:16:47,642][36780] Fps is (10 sec: 2048.5, 60 sec: 3208.5, 300 sec: 3125.9). Total num frames: 4599808. Throughput: 0: 828.2. Samples: 148086. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+ [2023-02-25 20:16:47,644][36780] Avg episode reward: [(0, '25.830')]
+ [2023-02-25 20:16:51,336][37007] Updated weights for policy 0, policy_version 1128 (0.0016)
+ [2023-02-25 20:16:52,642][36780] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 3171.8). Total num frames: 4624384. Throughput: 0: 809.0. Samples: 153880. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:16:52,650][36780] Avg episode reward: [(0, '25.715')]
+ [2023-02-25 20:16:57,642][36780] Fps is (10 sec: 4505.6, 60 sec: 3481.7, 300 sec: 3194.9). Total num frames: 4644864. Throughput: 0: 820.9. Samples: 160150. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+ [2023-02-25 20:16:57,647][36780] Avg episode reward: [(0, '26.000')]
+ [2023-02-25 20:17:02,642][36780] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3176.9). Total num frames: 4657152. Throughput: 0: 820.2. Samples: 162188. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+ [2023-02-25 20:17:02,649][36780] Avg episode reward: [(0, '26.506')]
+ [2023-02-25 20:17:03,058][37007] Updated weights for policy 0, policy_version 1138 (0.0035)
+ [2023-02-25 20:17:07,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3179.3). Total num frames: 4673536. Throughput: 0: 816.8. Samples: 166412. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:17:07,650][36780] Avg episode reward: [(0, '25.539')]
+ [2023-02-25 20:17:12,642][36780] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3200.6). Total num frames: 4694016. Throughput: 0: 802.5. Samples: 172308. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:17:12,650][36780] Avg episode reward: [(0, '28.332')]
+ [2023-02-25 20:17:12,663][36994] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001146_4694016.pth...
+ [2023-02-25 20:17:12,822][36994] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth
+ [2023-02-25 20:17:12,833][36994] Saving new best policy, reward=28.332!
+ [2023-02-25 20:17:14,422][37007] Updated weights for policy 0, policy_version 1148 (0.0028)
+ [2023-02-25 20:17:17,643][36780] Fps is (10 sec: 4095.7, 60 sec: 3413.3, 300 sec: 3220.9). Total num frames: 4714496. Throughput: 0: 798.6. Samples: 175354. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:17:17,645][36780] Avg episode reward: [(0, '30.397')]
+ [2023-02-25 20:17:17,651][36994] Saving new best policy, reward=30.397!
+ [2023-02-25 20:17:22,644][36780] Fps is (10 sec: 3276.3, 60 sec: 3345.0, 300 sec: 3204.0). Total num frames: 4726784. Throughput: 0: 833.5. Samples: 180518. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:17:22,649][36780] Avg episode reward: [(0, '30.386')]
+ [2023-02-25 20:17:27,554][37007] Updated weights for policy 0, policy_version 1158 (0.0014)
+ [2023-02-25 20:17:27,642][36780] Fps is (10 sec: 2867.4, 60 sec: 3208.5, 300 sec: 3205.6). Total num frames: 4743168. Throughput: 0: 848.9. Samples: 184512. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:17:27,644][36780] Avg episode reward: [(0, '30.981')]
+ [2023-02-25 20:17:27,652][36994] Saving new best policy, reward=30.981!
+ [2023-02-25 20:17:32,642][36780] Fps is (10 sec: 3277.3, 60 sec: 3208.5, 300 sec: 3207.1). Total num frames: 4759552. Throughput: 0: 868.4. Samples: 187164. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:17:32,645][36780] Avg episode reward: [(0, '31.256')]
+ [2023-02-25 20:17:32,664][36994] Saving new best policy, reward=31.256!
+ [2023-02-25 20:17:37,420][37007] Updated weights for policy 0, policy_version 1168 (0.0020)
+ [2023-02-25 20:17:37,642][36780] Fps is (10 sec: 4096.1, 60 sec: 3413.5, 300 sec: 3242.7). Total num frames: 4784128. Throughput: 0: 883.2. Samples: 193624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:17:37,645][36780] Avg episode reward: [(0, '30.355')]
+ [2023-02-25 20:17:42,644][36780] Fps is (10 sec: 3685.4, 60 sec: 3481.4, 300 sec: 3226.6). Total num frames: 4796416. Throughput: 0: 857.1. Samples: 198720. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:17:42,650][36780] Avg episode reward: [(0, '29.875')]
+ [2023-02-25 20:17:47,643][36780] Fps is (10 sec: 2866.8, 60 sec: 3549.8, 300 sec: 3227.6). Total num frames: 4812800. Throughput: 0: 855.9. Samples: 200706. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:17:47,649][36780] Avg episode reward: [(0, '29.212')]
+ [2023-02-25 20:17:50,775][37007] Updated weights for policy 0, policy_version 1178 (0.0017)
+ [2023-02-25 20:17:52,642][36780] Fps is (10 sec: 3277.5, 60 sec: 3413.3, 300 sec: 3228.6). Total num frames: 4829184. Throughput: 0: 871.8. Samples: 205644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:17:52,650][36780] Avg episode reward: [(0, '28.389')]
+ [2023-02-25 20:17:57,642][36780] Fps is (10 sec: 4096.5, 60 sec: 3481.6, 300 sec: 3261.0). Total num frames: 4853760. Throughput: 0: 884.4. Samples: 212104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:17:57,647][36780] Avg episode reward: [(0, '27.861')]
+ [2023-02-25 20:18:01,006][37007] Updated weights for policy 0, policy_version 1188 (0.0017)
+ [2023-02-25 20:18:02,642][36780] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3245.9). Total num frames: 4866048. Throughput: 0: 878.6. Samples: 214892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:18:02,649][36780] Avg episode reward: [(0, '26.830')]
+ [2023-02-25 20:18:07,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3246.5). Total num frames: 4882432. Throughput: 0: 857.2. Samples: 219090. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:18:07,650][36780] Avg episode reward: [(0, '27.270')]
+ [2023-02-25 20:18:12,642][36780] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3247.0). Total num frames: 4898816. Throughput: 0: 878.2. Samples: 224030. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+ [2023-02-25 20:18:12,648][36780] Avg episode reward: [(0, '27.520')]
+ [2023-02-25 20:18:13,930][37007] Updated weights for policy 0, policy_version 1198 (0.0020)
+ [2023-02-25 20:18:17,642][36780] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3276.8). Total num frames: 4923392. Throughput: 0: 888.8. Samples: 227160. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+ [2023-02-25 20:18:17,644][36780] Avg episode reward: [(0, '27.217')]
+ [2023-02-25 20:18:22,647][36780] Fps is (10 sec: 4094.1, 60 sec: 3549.7, 300 sec: 3276.7). Total num frames: 4939776. Throughput: 0: 880.8. Samples: 233266. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:18:22,653][36780] Avg episode reward: [(0, '26.442')]
+ [2023-02-25 20:18:25,255][37007] Updated weights for policy 0, policy_version 1208 (0.0014)
+ [2023-02-25 20:18:27,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3262.7). Total num frames: 4952064. Throughput: 0: 859.2. Samples: 237380. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:18:27,648][36780] Avg episode reward: [(0, '24.585')]
+ [2023-02-25 20:18:32,642][36780] Fps is (10 sec: 2868.5, 60 sec: 3481.6, 300 sec: 3262.9). Total num frames: 4968448. Throughput: 0: 861.4. Samples: 239468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:18:32,645][36780] Avg episode reward: [(0, '25.731')]
+ [2023-02-25 20:18:36,645][37007] Updated weights for policy 0, policy_version 1218 (0.0021)
+ [2023-02-25 20:18:37,642][36780] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3346.2). Total num frames: 4993024. Throughput: 0: 889.2. Samples: 245660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:18:37,645][36780] Avg episode reward: [(0, '26.719')]
+ [2023-02-25 20:18:42,642][36780] Fps is (10 sec: 4096.1, 60 sec: 3550.0, 300 sec: 3401.8). Total num frames: 5009408. Throughput: 0: 880.6. Samples: 251732. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:18:42,650][36780] Avg episode reward: [(0, '26.710')]
+ [2023-02-25 20:18:47,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3481.7, 300 sec: 3443.4). Total num frames: 5021696. Throughput: 0: 863.8. Samples: 253762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:18:47,653][36780] Avg episode reward: [(0, '25.818')]
+ [2023-02-25 20:18:49,265][37007] Updated weights for policy 0, policy_version 1228 (0.0012)
+ [2023-02-25 20:18:52,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 5038080. Throughput: 0: 861.8. Samples: 257872. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+ [2023-02-25 20:18:52,648][36780] Avg episode reward: [(0, '24.994')]
+ [2023-02-25 20:18:57,642][36780] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 5058560. Throughput: 0: 888.8. Samples: 264026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:18:57,648][36780] Avg episode reward: [(0, '25.185')]
+ [2023-02-25 20:18:59,577][37007] Updated weights for policy 0, policy_version 1238 (0.0019)
+ [2023-02-25 20:19:02,642][36780] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 5079040. Throughput: 0: 890.8. Samples: 267244. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+ [2023-02-25 20:19:02,650][36780] Avg episode reward: [(0, '25.028')]
+ [2023-02-25 20:19:07,642][36780] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 5095424. Throughput: 0: 859.7. Samples: 271948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:19:07,647][36780] Avg episode reward: [(0, '23.659')]
+ [2023-02-25 20:19:12,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 5107712. Throughput: 0: 859.9. Samples: 276076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:19:12,648][36780] Avg episode reward: [(0, '22.873')]
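
Note on reading the throughput lines: each 'Fps is (10 sec: ...)' figure is simply the frame delta over the reporting window, so it can be checked directly against 'Total num frames'. Using the first two training reports above (4005888 -> 4022272 frames over a 10-second window):

# FPS figures in the log are frame deltas divided by the reporting window.
frames_prev, frames_now = 4_005_888, 4_022_272
window_sec = 10
print((frames_now - frames_prev) / window_sec)  # 1638.4, matching "Fps is (10 sec: 1638.4, ...)"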
1933
+ [2023-02-25 20:19:12,661][36994] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001247_5107712.pth...
1934
+ [2023-02-25 20:19:12,788][36994] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001046_4284416.pth
1935
+ [2023-02-25 20:19:13,164][37007] Updated weights for policy 0, policy_version 1248 (0.0031)
1936
+ [2023-02-25 20:19:17,642][36780] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 5128192. Throughput: 0: 881.0. Samples: 279114. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:19:17,652][36780] Avg episode reward: [(0, '22.641')]
+ [2023-02-25 20:19:22,491][37007] Updated weights for policy 0, policy_version 1258 (0.0018)
+ [2023-02-25 20:19:22,645][36780] Fps is (10 sec: 4504.4, 60 sec: 3550.0, 300 sec: 3457.3). Total num frames: 5152768. Throughput: 0: 891.1. Samples: 285762. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+ [2023-02-25 20:19:22,648][36780] Avg episode reward: [(0, '23.764')]
+ [2023-02-25 20:19:27,642][36780] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 5165056. Throughput: 0: 858.8. Samples: 290376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:19:27,648][36780] Avg episode reward: [(0, '25.544')]
+ [2023-02-25 20:19:32,642][36780] Fps is (10 sec: 2458.3, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 5177344. Throughput: 0: 859.4. Samples: 292434. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:19:32,645][36780] Avg episode reward: [(0, '24.751')]
+ [2023-02-25 20:19:35,881][37007] Updated weights for policy 0, policy_version 1268 (0.0021)
+ [2023-02-25 20:19:37,642][36780] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 5197824. Throughput: 0: 886.7. Samples: 297774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:19:37,644][36780] Avg episode reward: [(0, '24.980')]
+ [2023-02-25 20:19:42,642][36780] Fps is (10 sec: 4505.7, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 5222400. Throughput: 0: 893.0. Samples: 304210. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:19:42,648][36780] Avg episode reward: [(0, '27.353')]
+ [2023-02-25 20:19:46,690][37007] Updated weights for policy 0, policy_version 1278 (0.0026)
+ [2023-02-25 20:19:47,642][36780] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 5234688. Throughput: 0: 878.1. Samples: 306758. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:19:47,648][36780] Avg episode reward: [(0, '27.667')]
+ [2023-02-25 20:19:52,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 5251072. Throughput: 0: 864.0. Samples: 310826. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:19:52,647][36780] Avg episode reward: [(0, '27.888')]
+ [2023-02-25 20:19:57,642][36780] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 5267456. Throughput: 0: 887.2. Samples: 316000. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:19:57,648][36780] Avg episode reward: [(0, '27.351')]
+ [2023-02-25 20:19:58,933][37007] Updated weights for policy 0, policy_version 1288 (0.0017)
+ [2023-02-25 20:20:02,647][36780] Fps is (10 sec: 4093.8, 60 sec: 3549.6, 300 sec: 3457.2). Total num frames: 5292032. Throughput: 0: 891.0. Samples: 319214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:20:02,656][36780] Avg episode reward: [(0, '28.102')]
+ [2023-02-25 20:20:07,644][36780] Fps is (10 sec: 4095.2, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 5308416. Throughput: 0: 876.1. Samples: 325184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:20:07,648][36780] Avg episode reward: [(0, '27.974')]
+ [2023-02-25 20:20:10,581][37007] Updated weights for policy 0, policy_version 1298 (0.0028)
+ [2023-02-25 20:20:12,642][36780] Fps is (10 sec: 2868.7, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 5320704. Throughput: 0: 862.7. Samples: 329196. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:20:12,651][36780] Avg episode reward: [(0, '27.162')]
+ [2023-02-25 20:20:17,642][36780] Fps is (10 sec: 2867.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 5337088. Throughput: 0: 864.2. Samples: 331324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:20:17,644][36780] Avg episode reward: [(0, '27.150')]
+ [2023-02-25 20:20:21,712][37007] Updated weights for policy 0, policy_version 1308 (0.0020)
+ [2023-02-25 20:20:22,642][36780] Fps is (10 sec: 4096.0, 60 sec: 3481.8, 300 sec: 3457.3). Total num frames: 5361664. Throughput: 0: 891.4. Samples: 337888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:20:22,644][36780] Avg episode reward: [(0, '26.277')]
+ [2023-02-25 20:20:27,642][36780] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 5378048. Throughput: 0: 872.8. Samples: 343486. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:20:27,648][36780] Avg episode reward: [(0, '27.977')]
+ [2023-02-25 20:20:32,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 5390336. Throughput: 0: 862.5. Samples: 345570. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:20:32,653][36780] Avg episode reward: [(0, '27.962')]
+ [2023-02-25 20:20:34,509][37007] Updated weights for policy 0, policy_version 1318 (0.0012)
+ [2023-02-25 20:20:37,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 5406720. Throughput: 0: 867.9. Samples: 349882. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:20:37,649][36780] Avg episode reward: [(0, '28.336')]
+ [2023-02-25 20:20:42,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3429.5). Total num frames: 5419008. Throughput: 0: 842.5. Samples: 353912. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
+ [2023-02-25 20:20:42,646][36780] Avg episode reward: [(0, '28.040')]
+ [2023-02-25 20:20:47,642][36780] Fps is (10 sec: 2457.5, 60 sec: 3276.8, 300 sec: 3415.7). Total num frames: 5431296. Throughput: 0: 817.4. Samples: 355994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:20:47,645][36780] Avg episode reward: [(0, '28.838')]
+ [2023-02-25 20:20:49,364][37007] Updated weights for policy 0, policy_version 1328 (0.0029)
+ [2023-02-25 20:20:52,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3429.5). Total num frames: 5447680. Throughput: 0: 774.5. Samples: 360034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:20:52,649][36780] Avg episode reward: [(0, '28.336')]
+ [2023-02-25 20:20:57,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 5459968. Throughput: 0: 776.0. Samples: 364116. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+ [2023-02-25 20:20:57,645][36780] Avg episode reward: [(0, '27.955')]
+ [2023-02-25 20:21:01,530][37007] Updated weights for policy 0, policy_version 1338 (0.0028)
+ [2023-02-25 20:21:02,642][36780] Fps is (10 sec: 3686.4, 60 sec: 3208.8, 300 sec: 3415.6). Total num frames: 5484544. Throughput: 0: 802.2. Samples: 367424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:21:02,646][36780] Avg episode reward: [(0, '28.915')]
+ [2023-02-25 20:21:07,646][36780] Fps is (10 sec: 4504.1, 60 sec: 3276.7, 300 sec: 3415.6). Total num frames: 5505024. Throughput: 0: 805.1. Samples: 374120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:21:07,649][36780] Avg episode reward: [(0, '26.195')]
+ [2023-02-25 20:21:12,643][36780] Fps is (10 sec: 3276.5, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 5517312. Throughput: 0: 779.6. Samples: 378568. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:21:12,649][36780] Avg episode reward: [(0, '26.725')]
+ [2023-02-25 20:21:12,661][36994] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001347_5517312.pth...
+ [2023-02-25 20:21:12,796][36994] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001146_4694016.pth
+ [2023-02-25 20:21:13,377][37007] Updated weights for policy 0, policy_version 1348 (0.0023)
+ [2023-02-25 20:21:17,642][36780] Fps is (10 sec: 2458.4, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 5529600. Throughput: 0: 778.7. Samples: 380612. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:21:17,649][36780] Avg episode reward: [(0, '26.737')]
+ [2023-02-25 20:21:22,642][36780] Fps is (10 sec: 3277.1, 60 sec: 3140.3, 300 sec: 3387.9). Total num frames: 5550080. Throughput: 0: 802.0. Samples: 385972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:21:22,645][36780] Avg episode reward: [(0, '25.824')]
+ [2023-02-25 20:21:24,566][37007] Updated weights for policy 0, policy_version 1358 (0.0016)
+ [2023-02-25 20:21:27,642][36780] Fps is (10 sec: 4505.9, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 5574656. Throughput: 0: 856.4. Samples: 392452. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:21:27,645][36780] Avg episode reward: [(0, '25.973')]
+ [2023-02-25 20:21:32,644][36780] Fps is (10 sec: 3685.7, 60 sec: 3276.7, 300 sec: 3415.7). Total num frames: 5586944. Throughput: 0: 863.3. Samples: 394844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:21:32,651][36780] Avg episode reward: [(0, '26.090')]
+ [2023-02-25 20:21:37,438][37007] Updated weights for policy 0, policy_version 1368 (0.0030)
+ [2023-02-25 20:21:37,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3443.4). Total num frames: 5603328. Throughput: 0: 863.2. Samples: 398878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:21:37,648][36780] Avg episode reward: [(0, '27.051')]
+ [2023-02-25 20:21:42,642][36780] Fps is (10 sec: 3277.2, 60 sec: 3345.0, 300 sec: 3457.3). Total num frames: 5619712. Throughput: 0: 890.1. Samples: 404172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:21:42,648][36780] Avg episode reward: [(0, '28.320')]
+ [2023-02-25 20:21:47,642][36780] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 5640192. Throughput: 0: 889.1. Samples: 407434. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:21:47,644][36780] Avg episode reward: [(0, '27.490')]
+ [2023-02-25 20:21:47,675][37007] Updated weights for policy 0, policy_version 1378 (0.0023)
+ [2023-02-25 20:21:52,643][36780] Fps is (10 sec: 3686.1, 60 sec: 3481.5, 300 sec: 3429.5). Total num frames: 5656576. Throughput: 0: 865.8. Samples: 413080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:21:52,647][36780] Avg episode reward: [(0, '27.297')]
+ [2023-02-25 20:21:57,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 5668864. Throughput: 0: 852.7. Samples: 416940. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:21:57,648][36780] Avg episode reward: [(0, '27.868')]
+ [2023-02-25 20:22:01,327][37007] Updated weights for policy 0, policy_version 1388 (0.0027)
+ [2023-02-25 20:22:02,642][36780] Fps is (10 sec: 3277.3, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 5689344. Throughput: 0: 856.5. Samples: 419156. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:22:02,644][36780] Avg episode reward: [(0, '28.201')]
+ [2023-02-25 20:22:07,642][36780] Fps is (10 sec: 4096.0, 60 sec: 3413.5, 300 sec: 3443.4). Total num frames: 5709824. Throughput: 0: 878.8. Samples: 425518. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:22:07,649][36780] Avg episode reward: [(0, '29.350')]
+ [2023-02-25 20:22:11,719][37007] Updated weights for policy 0, policy_version 1398 (0.0012)
+ [2023-02-25 20:22:12,645][36780] Fps is (10 sec: 3685.4, 60 sec: 3481.5, 300 sec: 3429.5). Total num frames: 5726208. Throughput: 0: 853.7. Samples: 430870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:22:12,647][36780] Avg episode reward: [(0, '27.573')]
+ [2023-02-25 20:22:17,647][36780] Fps is (10 sec: 3275.2, 60 sec: 3549.6, 300 sec: 3443.4). Total num frames: 5742592. Throughput: 0: 847.0. Samples: 432964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:22:17,653][36780] Avg episode reward: [(0, '27.818')]
+ [2023-02-25 20:22:22,642][36780] Fps is (10 sec: 3277.7, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 5758976. Throughput: 0: 856.4. Samples: 437416. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+ [2023-02-25 20:22:22,649][36780] Avg episode reward: [(0, '28.267')]
+ [2023-02-25 20:22:24,316][37007] Updated weights for policy 0, policy_version 1408 (0.0022)
+ [2023-02-25 20:22:27,642][36780] Fps is (10 sec: 3688.3, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 5779456. Throughput: 0: 881.9. Samples: 443858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:22:27,649][36780] Avg episode reward: [(0, '27.919')]
+ [2023-02-25 20:22:32,642][36780] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3429.5). Total num frames: 5795840. Throughput: 0: 880.5. Samples: 447058. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:22:32,645][36780] Avg episode reward: [(0, '27.537')]
+ [2023-02-25 20:22:35,808][37007] Updated weights for policy 0, policy_version 1418 (0.0013)
+ [2023-02-25 20:22:37,642][36780] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 5812224. Throughput: 0: 848.4. Samples: 451258. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:22:37,646][36780] Avg episode reward: [(0, '26.920')]
+ [2023-02-25 20:22:42,642][36780] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 5828608. Throughput: 0: 866.8. Samples: 455944. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+ [2023-02-25 20:22:42,650][36780] Avg episode reward: [(0, '26.853')]
+ [2023-02-25 20:22:47,028][37007] Updated weights for policy 0, policy_version 1428 (0.0015)
+ [2023-02-25 20:22:47,642][36780] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 5849088. Throughput: 0: 891.8. Samples: 459288. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:22:47,647][36780] Avg episode reward: [(0, '26.227')]
+ [2023-02-25 20:22:52,642][36780] Fps is (10 sec: 4096.1, 60 sec: 3550.0, 300 sec: 3443.4). Total num frames: 5869568. Throughput: 0: 897.9. Samples: 465924. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
+ [2023-02-25 20:22:52,644][36780] Avg episode reward: [(0, '27.641')]
+ [2023-02-25 20:22:57,644][36780] Fps is (10 sec: 3276.1, 60 sec: 3549.8, 300 sec: 3443.4). Total num frames: 5881856. Throughput: 0: 871.2. Samples: 470074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:22:57,656][36780] Avg episode reward: [(0, '27.899')]
+ [2023-02-25 20:22:59,309][37007] Updated weights for policy 0, policy_version 1438 (0.0013)
+ [2023-02-25 20:23:02,642][36780] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 5898240. Throughput: 0: 870.6. Samples: 472136. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+ [2023-02-25 20:23:02,649][36780] Avg episode reward: [(0, '27.259')]
+ [2023-02-25 20:23:07,642][36780] Fps is (10 sec: 3687.1, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 5918720. Throughput: 0: 900.9. Samples: 477956. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:23:07,645][36780] Avg episode reward: [(0, '27.029')]
+ [2023-02-25 20:23:09,920][37007] Updated weights for policy 0, policy_version 1448 (0.0014)
+ [2023-02-25 20:23:12,642][36780] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3443.4). Total num frames: 5939200. Throughput: 0: 898.5. Samples: 484292. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+ [2023-02-25 20:23:12,647][36780] Avg episode reward: [(0, '26.202')]
+ [2023-02-25 20:23:12,659][36994] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001450_5939200.pth...
+ [2023-02-25 20:23:12,827][36994] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001247_5107712.pth
+ [2023-02-25 20:23:17,642][36780] Fps is (10 sec: 3276.8, 60 sec: 3481.9, 300 sec: 3429.6). Total num frames: 5951488. Throughput: 0: 871.7. Samples: 486284. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+ [2023-02-25 20:23:17,647][36780] Avg episode reward: [(0, '27.058')]
+ [2023-02-25 20:23:22,642][36780] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 5967872. Throughput: 0: 866.4. Samples: 490246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+ [2023-02-25 20:23:22,645][36780] Avg episode reward: [(0, '26.485')]
+ [2023-02-25 20:23:23,432][37007] Updated weights for policy 0, policy_version 1458 (0.0028)
+ [2023-02-25 20:23:27,642][36780] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 5988352. Throughput: 0: 894.0. Samples: 496172. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+ [2023-02-25 20:23:27,645][36780] Avg episode reward: [(0, '25.915')]
+ [2023-02-25 20:23:31,005][36994] Stopping Batcher_0...
+ [2023-02-25 20:23:31,007][36994] Loop batcher_evt_loop terminating...
+ [2023-02-25 20:23:31,005][36780] Component Batcher_0 stopped!
+ [2023-02-25 20:23:31,009][36994] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth...
+ [2023-02-25 20:23:31,056][37007] Weights refcount: 2 0
+ [2023-02-25 20:23:31,076][36780] Component InferenceWorker_p0-w0 stopped!
+ [2023-02-25 20:23:31,076][37007] Stopping InferenceWorker_p0-w0...
+ [2023-02-25 20:23:31,087][37007] Loop inference_proc0-0_evt_loop terminating...
+ [2023-02-25 20:23:31,092][36780] Component RolloutWorker_w0 stopped!
+ [2023-02-25 20:23:31,095][36780] Component RolloutWorker_w2 stopped!
+ [2023-02-25 20:23:31,098][37011] Stopping RolloutWorker_w2...
+ [2023-02-25 20:23:31,098][37011] Loop rollout_proc2_evt_loop terminating...
+ [2023-02-25 20:23:31,100][37016] Stopping RolloutWorker_w7...
+ [2023-02-25 20:23:31,100][37016] Loop rollout_proc7_evt_loop terminating...
+ [2023-02-25 20:23:31,100][36780] Component RolloutWorker_w7 stopped!
+ [2023-02-25 20:23:31,108][36780] Component RolloutWorker_w4 stopped!
+ [2023-02-25 20:23:31,108][37013] Stopping RolloutWorker_w4...
+ [2023-02-25 20:23:31,099][37010] Stopping RolloutWorker_w0...
+ [2023-02-25 20:23:31,115][37015] Stopping RolloutWorker_w6...
+ [2023-02-25 20:23:31,114][36780] Component RolloutWorker_w6 stopped!
+ [2023-02-25 20:23:31,119][37010] Loop rollout_proc0_evt_loop terminating...
+ [2023-02-25 20:23:31,111][37013] Loop rollout_proc4_evt_loop terminating...
+ [2023-02-25 20:23:31,123][37015] Loop rollout_proc6_evt_loop terminating...
+ [2023-02-25 20:23:31,129][36780] Component RolloutWorker_w1 stopped!
+ [2023-02-25 20:23:31,137][37014] Stopping RolloutWorker_w5...
+ [2023-02-25 20:23:31,137][36780] Component RolloutWorker_w5 stopped!
+ [2023-02-25 20:23:31,129][37009] Stopping RolloutWorker_w1...
+ [2023-02-25 20:23:31,150][37012] Stopping RolloutWorker_w3...
+ [2023-02-25 20:23:31,150][36780] Component RolloutWorker_w3 stopped!
+ [2023-02-25 20:23:31,138][37014] Loop rollout_proc5_evt_loop terminating...
+ [2023-02-25 20:23:31,142][37009] Loop rollout_proc1_evt_loop terminating...
+ [2023-02-25 20:23:31,151][37012] Loop rollout_proc3_evt_loop terminating...
+ [2023-02-25 20:23:31,175][36994] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001347_5517312.pth
+ [2023-02-25 20:23:31,185][36994] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth...
+ [2023-02-25 20:23:31,392][36780] Component LearnerWorker_p0 stopped!
+ [2023-02-25 20:23:31,394][36780] Waiting for process learner_proc0 to stop...
+ [2023-02-25 20:23:31,403][36994] Stopping LearnerWorker_p0...
+ [2023-02-25 20:23:31,404][36994] Loop learner_proc0_evt_loop terminating...
+ [2023-02-25 20:23:33,398][36780] Waiting for process inference_proc0-0 to join...
+ [2023-02-25 20:23:33,904][36780] Waiting for process rollout_proc0 to join...
+ [2023-02-25 20:23:35,031][36780] Waiting for process rollout_proc1 to join...
+ [2023-02-25 20:23:35,037][36780] Waiting for process rollout_proc2 to join...
+ [2023-02-25 20:23:35,045][36780] Waiting for process rollout_proc3 to join...
+ [2023-02-25 20:23:35,049][36780] Waiting for process rollout_proc4 to join...
+ [2023-02-25 20:23:35,050][36780] Waiting for process rollout_proc5 to join...
+ [2023-02-25 20:23:35,053][36780] Waiting for process rollout_proc6 to join...
+ [2023-02-25 20:23:35,054][36780] Waiting for process rollout_proc7 to join...
+ [2023-02-25 20:23:35,056][36780] Batcher 0 profile tree view:
+ batching: 13.9036, releasing_batches: 0.0135
+ [2023-02-25 20:23:35,057][36780] InferenceWorker_p0-w0 profile tree view:
+ wait_policy: 0.0000
+ wait_policy_total: 282.2654
+ update_model: 3.9689
+ weight_update: 0.0013
+ one_step: 0.0023
+ handle_policy_step: 281.5777
+ deserialize: 8.0065, stack: 1.5816, obs_to_device_normalize: 60.4016, forward: 138.2783, send_messages: 14.1621
+ prepare_outputs: 44.6963
+ to_cpu: 27.5721
+ [2023-02-25 20:23:35,060][36780] Learner 0 profile tree view:
+ misc: 0.0026, prepare_batch: 12.2238
+ train: 41.1266
+ epoch_init: 0.0076, minibatch_init: 0.0054, losses_postprocess: 0.3181, kl_divergence: 0.3252, after_optimizer: 1.9096
+ calculate_losses: 13.5311
+ losses_init: 0.0018, forward_head: 1.0308, bptt_initial: 8.6948, tail: 0.5480, advantages_returns: 0.1486, losses: 1.7926
+ bptt: 1.1595
+ bptt_forward_core: 1.1141
+ update: 24.6911
+ clip: 0.7190
+ [2023-02-25 20:23:35,062][36780] RolloutWorker_w0 profile tree view:
+ wait_for_trajectories: 0.1817, enqueue_policy_requests: 78.4674, env_step: 438.6337, overhead: 12.2173, complete_rollouts: 4.2548
+ save_policy_outputs: 11.4993
+ split_output_tensors: 5.4428
+ [2023-02-25 20:23:35,063][36780] RolloutWorker_w7 profile tree view:
+ wait_for_trajectories: 0.1953, enqueue_policy_requests: 80.0619, env_step: 438.7543, overhead: 11.5066, complete_rollouts: 3.6980
+ save_policy_outputs: 11.7847
+ split_output_tensors: 5.5995
+ [2023-02-25 20:23:35,067][36780] Loop Runner_EvtLoop terminating...
+ [2023-02-25 20:23:35,069][36780] Runner profile tree view:
+ main_loop: 618.6015
+ [2023-02-25 20:23:35,070][36780] Collected {0: 6004736}, FPS: 3231.2
+ [2023-02-25 20:23:35,214][36780] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
+ [2023-02-25 20:23:35,217][36780] Overriding arg 'num_workers' with value 1 passed from command line
+ [2023-02-25 20:23:35,219][36780] Adding new argument 'no_render'=True that is not in the saved config file!
+ [2023-02-25 20:23:35,221][36780] Adding new argument 'save_video'=True that is not in the saved config file!
+ [2023-02-25 20:23:35,222][36780] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+ [2023-02-25 20:23:35,227][36780] Adding new argument 'video_name'=None that is not in the saved config file!
+ [2023-02-25 20:23:35,228][36780] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
+ [2023-02-25 20:23:35,229][36780] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+ [2023-02-25 20:23:35,230][36780] Adding new argument 'push_to_hub'=False that is not in the saved config file!
+ [2023-02-25 20:23:35,231][36780] Adding new argument 'hf_repository'=None that is not in the saved config file!
+ [2023-02-25 20:23:35,233][36780] Adding new argument 'policy_index'=0 that is not in the saved config file!
+ [2023-02-25 20:23:35,235][36780] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+ [2023-02-25 20:23:35,238][36780] Adding new argument 'train_script'=None that is not in the saved config file!
+ [2023-02-25 20:23:35,239][36780] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+ [2023-02-25 20:23:35,241][36780] Using frameskip 1 and render_action_repeat=4 for evaluation
+ [2023-02-25 20:23:35,285][36780] Doom resolution: 160x120, resize resolution: (128, 72)
+ [2023-02-25 20:23:35,290][36780] RunningMeanStd input shape: (3, 72, 128)
+ [2023-02-25 20:23:35,295][36780] RunningMeanStd input shape: (1,)
+ [2023-02-25 20:23:35,325][36780] ConvEncoder: input_channels=3
+ [2023-02-25 20:23:36,110][36780] Conv encoder output size: 512
+ [2023-02-25 20:23:36,116][36780] Policy head output size: 512
+ [2023-02-25 20:23:38,730][36780] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth...
+ [2023-02-25 20:23:39,998][36780] Num frames 100...
+ [2023-02-25 20:23:40,128][36780] Num frames 200...
+ [2023-02-25 20:23:40,238][36780] Num frames 300...
+ [2023-02-25 20:23:40,350][36780] Num frames 400...
+ [2023-02-25 20:23:40,461][36780] Num frames 500...
+ [2023-02-25 20:23:40,575][36780] Num frames 600...
+ [2023-02-25 20:23:40,691][36780] Num frames 700...
+ [2023-02-25 20:23:40,811][36780] Num frames 800...
+ [2023-02-25 20:23:40,924][36780] Num frames 900...
+ [2023-02-25 20:23:41,038][36780] Num frames 1000...
+ [2023-02-25 20:23:41,170][36780] Num frames 1100...
+ [2023-02-25 20:23:41,294][36780] Num frames 1200...
+ [2023-02-25 20:23:41,410][36780] Num frames 1300...
+ [2023-02-25 20:23:41,528][36780] Num frames 1400...
+ [2023-02-25 20:23:41,652][36780] Num frames 1500...
+ [2023-02-25 20:23:41,767][36780] Num frames 1600...
+ [2023-02-25 20:23:41,883][36780] Num frames 1700...
+ [2023-02-25 20:23:42,016][36780] Avg episode rewards: #0: 47.639, true rewards: #0: 17.640
+ [2023-02-25 20:23:42,018][36780] Avg episode reward: 47.639, avg true_objective: 17.640
+ [2023-02-25 20:23:42,072][36780] Num frames 1800...
+ [2023-02-25 20:23:42,196][36780] Num frames 1900...
+ [2023-02-25 20:23:42,312][36780] Num frames 2000...
+ [2023-02-25 20:23:42,433][36780] Num frames 2100...
+ [2023-02-25 20:23:42,547][36780] Num frames 2200...
+ [2023-02-25 20:23:42,660][36780] Num frames 2300...
+ [2023-02-25 20:23:42,723][36780] Avg episode rewards: #0: 28.525, true rewards: #0: 11.525
+ [2023-02-25 20:23:42,726][36780] Avg episode reward: 28.525, avg true_objective: 11.525
+ [2023-02-25 20:23:42,847][36780] Num frames 2400...
+ [2023-02-25 20:23:42,960][36780] Num frames 2500...
+ [2023-02-25 20:23:43,077][36780] Num frames 2600...
+ [2023-02-25 20:23:43,198][36780] Num frames 2700...
+ [2023-02-25 20:23:43,325][36780] Num frames 2800...
+ [2023-02-25 20:23:43,444][36780] Num frames 2900...
+ [2023-02-25 20:23:43,563][36780] Num frames 3000...
+ [2023-02-25 20:23:43,688][36780] Num frames 3100...
+ [2023-02-25 20:23:43,805][36780] Num frames 3200...
+ [2023-02-25 20:23:43,918][36780] Num frames 3300...
+ [2023-02-25 20:23:44,030][36780] Num frames 3400...
+ [2023-02-25 20:23:44,159][36780] Num frames 3500...
+ [2023-02-25 20:23:44,273][36780] Num frames 3600...
+ [2023-02-25 20:23:44,351][36780] Avg episode rewards: #0: 29.723, true rewards: #0: 12.057
+ [2023-02-25 20:23:44,352][36780] Avg episode reward: 29.723, avg true_objective: 12.057
+ [2023-02-25 20:23:44,449][36780] Num frames 3700...
+ [2023-02-25 20:23:44,562][36780] Num frames 3800...
+ [2023-02-25 20:23:44,679][36780] Num frames 3900...
+ [2023-02-25 20:23:44,792][36780] Num frames 4000...
+ [2023-02-25 20:23:44,905][36780] Num frames 4100...
+ [2023-02-25 20:23:45,024][36780] Num frames 4200...
+ [2023-02-25 20:23:45,138][36780] Num frames 4300...
+ [2023-02-25 20:23:45,296][36780] Avg episode rewards: #0: 26.212, true rewards: #0: 10.962
+ [2023-02-25 20:23:45,298][36780] Avg episode reward: 26.212, avg true_objective: 10.962
+ [2023-02-25 20:23:45,319][36780] Num frames 4400...
+ [2023-02-25 20:23:45,440][36780] Num frames 4500...
+ [2023-02-25 20:23:45,551][36780] Num frames 4600...
+ [2023-02-25 20:23:45,662][36780] Num frames 4700...
+ [2023-02-25 20:23:45,779][36780] Num frames 4800...
+ [2023-02-25 20:23:45,895][36780] Num frames 4900...
+ [2023-02-25 20:23:46,010][36780] Num frames 5000...
+ [2023-02-25 20:23:46,124][36780] Num frames 5100...
+ [2023-02-25 20:23:46,244][36780] Num frames 5200...
+ [2023-02-25 20:23:46,364][36780] Num frames 5300...
+ [2023-02-25 20:23:46,479][36780] Num frames 5400...
+ [2023-02-25 20:23:46,583][36780] Avg episode rewards: #0: 25.682, true rewards: #0: 10.882
+ [2023-02-25 20:23:46,585][36780] Avg episode reward: 25.682, avg true_objective: 10.882
+ [2023-02-25 20:23:46,665][36780] Num frames 5500...
+ [2023-02-25 20:23:46,781][36780] Num frames 5600...
+ [2023-02-25 20:23:46,892][36780] Num frames 5700...
+ [2023-02-25 20:23:47,006][36780] Num frames 5800...
+ [2023-02-25 20:23:47,124][36780] Num frames 5900...
+ [2023-02-25 20:23:47,239][36780] Num frames 6000...
+ [2023-02-25 20:23:47,404][36780] Num frames 6100...
+ [2023-02-25 20:23:47,567][36780] Num frames 6200...
+ [2023-02-25 20:23:47,645][36780] Avg episode rewards: #0: 23.682, true rewards: #0: 10.348
+ [2023-02-25 20:23:47,648][36780] Avg episode reward: 23.682, avg true_objective: 10.348
+ [2023-02-25 20:23:47,795][36780] Num frames 6300...
+ [2023-02-25 20:23:47,958][36780] Num frames 6400...
+ [2023-02-25 20:23:48,117][36780] Num frames 6500...
+ [2023-02-25 20:23:48,285][36780] Num frames 6600...
+ [2023-02-25 20:23:48,446][36780] Num frames 6700...
+ [2023-02-25 20:23:48,609][36780] Num frames 6800...
+ [2023-02-25 20:23:48,771][36780] Num frames 6900...
+ [2023-02-25 20:23:48,945][36780] Num frames 7000...
+ [2023-02-25 20:23:49,107][36780] Num frames 7100...
+ [2023-02-25 20:23:49,270][36780] Num frames 7200...
+ [2023-02-25 20:23:49,437][36780] Num frames 7300...
+ [2023-02-25 20:23:49,604][36780] Num frames 7400...
+ [2023-02-25 20:23:49,777][36780] Num frames 7500...
+ [2023-02-25 20:23:49,946][36780] Num frames 7600...
+ [2023-02-25 20:23:50,116][36780] Num frames 7700...
+ [2023-02-25 20:23:50,283][36780] Num frames 7800...
+ [2023-02-25 20:23:50,455][36780] Num frames 7900...
+ [2023-02-25 20:23:50,627][36780] Num frames 8000...
+ [2023-02-25 20:23:50,741][36780] Avg episode rewards: #0: 27.904, true rewards: #0: 11.476
+ [2023-02-25 20:23:50,744][36780] Avg episode reward: 27.904, avg true_objective: 11.476
+ [2023-02-25 20:23:50,856][36780] Num frames 8100...
+ [2023-02-25 20:23:50,984][36780] Num frames 8200...
+ [2023-02-25 20:23:51,099][36780] Num frames 8300...
+ [2023-02-25 20:23:51,225][36780] Num frames 8400...
+ [2023-02-25 20:23:51,343][36780] Num frames 8500...
+ [2023-02-25 20:23:51,463][36780] Num frames 8600...
+ [2023-02-25 20:23:51,590][36780] Num frames 8700...
+ [2023-02-25 20:23:51,703][36780] Num frames 8800...
+ [2023-02-25 20:23:51,834][36780] Num frames 8900...
+ [2023-02-25 20:23:51,948][36780] Num frames 9000...
+ [2023-02-25 20:23:52,064][36780] Num frames 9100...
+ [2023-02-25 20:23:52,185][36780] Num frames 9200...
+ [2023-02-25 20:23:52,323][36780] Num frames 9300...
+ [2023-02-25 20:23:52,443][36780] Num frames 9400...
+ [2023-02-25 20:23:52,555][36780] Num frames 9500...
+ [2023-02-25 20:23:52,673][36780] Num frames 9600...
+ [2023-02-25 20:23:52,786][36780] Num frames 9700...
+ [2023-02-25 20:23:52,875][36780] Avg episode rewards: #0: 30.411, true rewards: #0: 12.161
+ [2023-02-25 20:23:52,876][36780] Avg episode reward: 30.411, avg true_objective: 12.161
+ [2023-02-25 20:23:52,965][36780] Num frames 9800...
+ [2023-02-25 20:23:53,089][36780] Num frames 9900...
+ [2023-02-25 20:23:53,212][36780] Num frames 10000...
+ [2023-02-25 20:23:53,339][36780] Num frames 10100...
+ [2023-02-25 20:23:53,466][36780] Num frames 10200...
+ [2023-02-25 20:23:53,591][36780] Num frames 10300...
+ [2023-02-25 20:23:53,659][36780] Avg episode rewards: #0: 28.117, true rewards: #0: 11.450
+ [2023-02-25 20:23:53,661][36780] Avg episode reward: 28.117, avg true_objective: 11.450
+ [2023-02-25 20:23:53,771][36780] Num frames 10400...
+ [2023-02-25 20:23:53,885][36780] Num frames 10500...
+ [2023-02-25 20:23:54,007][36780] Num frames 10600...
+ [2023-02-25 20:23:54,126][36780] Num frames 10700...
+ [2023-02-25 20:23:54,250][36780] Num frames 10800...
+ [2023-02-25 20:23:54,364][36780] Num frames 10900...
+ [2023-02-25 20:23:54,486][36780] Num frames 11000...
+ [2023-02-25 20:23:54,613][36780] Num frames 11100...
+ [2023-02-25 20:23:54,727][36780] Num frames 11200...
+ [2023-02-25 20:23:54,840][36780] Num frames 11300...
+ [2023-02-25 20:23:54,956][36780] Num frames 11400...
+ [2023-02-25 20:23:55,068][36780] Num frames 11500...
+ [2023-02-25 20:23:55,193][36780] Avg episode rewards: #0: 28.561, true rewards: #0: 11.561
+ [2023-02-25 20:23:55,195][36780] Avg episode reward: 28.561, avg true_objective: 11.561
+ [2023-02-25 20:25:11,633][36780] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
+ [2023-02-25 20:25:12,374][36780] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
+ [2023-02-25 20:25:12,376][36780] Overriding arg 'num_workers' with value 1 passed from command line
+ [2023-02-25 20:25:12,378][36780] Adding new argument 'no_render'=True that is not in the saved config file!
+ [2023-02-25 20:25:12,380][36780] Adding new argument 'save_video'=True that is not in the saved config file!
+ [2023-02-25 20:25:12,382][36780] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+ [2023-02-25 20:25:12,383][36780] Adding new argument 'video_name'=None that is not in the saved config file!
+ [2023-02-25 20:25:12,385][36780] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
+ [2023-02-25 20:25:12,386][36780] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+ [2023-02-25 20:25:12,387][36780] Adding new argument 'push_to_hub'=True that is not in the saved config file!
+ [2023-02-25 20:25:12,388][36780] Adding new argument 'hf_repository'='SergejSchweizer/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
+ [2023-02-25 20:25:12,389][36780] Adding new argument 'policy_index'=0 that is not in the saved config file!
+ [2023-02-25 20:25:12,390][36780] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+ [2023-02-25 20:25:12,391][36780] Adding new argument 'train_script'=None that is not in the saved config file!
+ [2023-02-25 20:25:12,392][36780] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+ [2023-02-25 20:25:12,393][36780] Using frameskip 1 and render_action_repeat=4 for evaluation
+ [2023-02-25 20:25:12,418][36780] RunningMeanStd input shape: (3, 72, 128)
+ [2023-02-25 20:25:12,420][36780] RunningMeanStd input shape: (1,)
+ [2023-02-25 20:25:12,437][36780] ConvEncoder: input_channels=3
+ [2023-02-25 20:25:12,501][36780] Conv encoder output size: 512
+ [2023-02-25 20:25:12,503][36780] Policy head output size: 512
+ [2023-02-25 20:25:12,533][36780] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth...
+ [2023-02-25 20:25:13,263][36780] Num frames 100...
+ [2023-02-25 20:25:13,424][36780] Num frames 200...
+ [2023-02-25 20:25:13,577][36780] Num frames 300...
+ [2023-02-25 20:25:13,734][36780] Num frames 400...
+ [2023-02-25 20:25:13,885][36780] Num frames 500...
+ [2023-02-25 20:25:14,062][36780] Num frames 600...
+ [2023-02-25 20:25:14,213][36780] Num frames 700...
+ [2023-02-25 20:25:14,373][36780] Num frames 800...
+ [2023-02-25 20:25:14,552][36780] Num frames 900...
+ [2023-02-25 20:25:14,733][36780] Num frames 1000...
+ [2023-02-25 20:25:14,906][36780] Num frames 1100...
+ [2023-02-25 20:25:15,082][36780] Num frames 1200...
+ [2023-02-25 20:25:15,246][36780] Num frames 1300...
+ [2023-02-25 20:25:15,428][36780] Num frames 1400...
+ [2023-02-25 20:25:15,599][36780] Num frames 1500...
+ [2023-02-25 20:25:15,762][36780] Num frames 1600...
+ [2023-02-25 20:25:15,916][36780] Num frames 1700...
+ [2023-02-25 20:25:16,122][36780] Num frames 1800...
+ [2023-02-25 20:25:16,298][36780] Num frames 1900...
+ [2023-02-25 20:25:16,491][36780] Num frames 2000...
+ [2023-02-25 20:25:16,697][36780] Num frames 2100...
+ [2023-02-25 20:25:16,754][36780] Avg episode rewards: #0: 57.999, true rewards: #0: 21.000
+ [2023-02-25 20:25:16,757][36780] Avg episode reward: 57.999, avg true_objective: 21.000
+ [2023-02-25 20:25:16,963][36780] Num frames 2200...
+ [2023-02-25 20:25:17,133][36780] Num frames 2300...
+ [2023-02-25 20:25:17,299][36780] Num frames 2400...
+ [2023-02-25 20:25:17,491][36780] Num frames 2500...
+ [2023-02-25 20:25:17,692][36780] Num frames 2600...
+ [2023-02-25 20:25:17,885][36780] Num frames 2700...
+ [2023-02-25 20:25:18,087][36780] Num frames 2800...
+ [2023-02-25 20:25:18,284][36780] Num frames 2900...
+ [2023-02-25 20:25:18,483][36780] Num frames 3000...
+ [2023-02-25 20:25:18,667][36780] Num frames 3100...
+ [2023-02-25 20:25:18,855][36780] Num frames 3200...
+ [2023-02-25 20:25:19,033][36780] Num frames 3300...
+ [2023-02-25 20:25:19,224][36780] Num frames 3400...
+ [2023-02-25 20:25:19,409][36780] Num frames 3500...
+ [2023-02-25 20:25:19,599][36780] Num frames 3600...
+ [2023-02-25 20:25:19,800][36780] Num frames 3700...
+ [2023-02-25 20:25:19,988][36780] Num frames 3800...
+ [2023-02-25 20:25:20,211][36780] Num frames 3900...
+ [2023-02-25 20:25:20,410][36780] Num frames 4000...
+ [2023-02-25 20:25:20,618][36780] Num frames 4100...
+ [2023-02-25 20:25:20,816][36780] Num frames 4200...
+ [2023-02-25 20:25:20,869][36780] Avg episode rewards: #0: 58.499, true rewards: #0: 21.000
+ [2023-02-25 20:25:20,870][36780] Avg episode reward: 58.499, avg true_objective: 21.000
+ [2023-02-25 20:25:21,027][36780] Num frames 4300...
+ [2023-02-25 20:25:21,197][36780] Num frames 4400...
+ [2023-02-25 20:25:21,351][36780] Num frames 4500...
+ [2023-02-25 20:25:21,507][36780] Num frames 4600...
+ [2023-02-25 20:25:21,647][36780] Num frames 4700...
+ [2023-02-25 20:25:21,777][36780] Num frames 4800...
+ [2023-02-25 20:25:21,933][36780] Avg episode rewards: #0: 44.286, true rewards: #0: 16.287
+ [2023-02-25 20:25:21,934][36780] Avg episode reward: 44.286, avg true_objective: 16.287
+ [2023-02-25 20:25:21,956][36780] Num frames 4900...
+ [2023-02-25 20:25:22,079][36780] Num frames 5000...
+ [2023-02-25 20:25:22,216][36780] Num frames 5100...
+ [2023-02-25 20:25:22,329][36780] Num frames 5200...
+ [2023-02-25 20:25:22,442][36780] Num frames 5300...
+ [2023-02-25 20:25:22,561][36780] Num frames 5400...
+ [2023-02-25 20:25:22,682][36780] Num frames 5500...
+ [2023-02-25 20:25:22,803][36780] Num frames 5600...
+ [2023-02-25 20:25:22,916][36780] Num frames 5700...
+ [2023-02-25 20:25:23,034][36780] Num frames 5800...
+ [2023-02-25 20:25:23,150][36780] Avg episode rewards: #0: 37.864, true rewards: #0: 14.615
+ [2023-02-25 20:25:23,152][36780] Avg episode reward: 37.864, avg true_objective: 14.615
+ [2023-02-25 20:25:23,223][36780] Num frames 5900...
+ [2023-02-25 20:25:23,348][36780] Num frames 6000...
+ [2023-02-25 20:25:23,479][36780] Num frames 6100...
+ [2023-02-25 20:25:23,616][36780] Num frames 6200...
+ [2023-02-25 20:25:23,732][36780] Num frames 6300...
+ [2023-02-25 20:25:23,850][36780] Num frames 6400...
+ [2023-02-25 20:25:23,964][36780] Num frames 6500...
+ [2023-02-25 20:25:24,078][36780] Num frames 6600...
+ [2023-02-25 20:25:24,198][36780] Num frames 6700...
+ [2023-02-25 20:25:24,312][36780] Num frames 6800...
+ [2023-02-25 20:25:24,429][36780] Num frames 6900...
+ [2023-02-25 20:25:24,552][36780] Num frames 7000...
+ [2023-02-25 20:25:24,667][36780] Num frames 7100...
+ [2023-02-25 20:25:24,790][36780] Avg episode rewards: #0: 36.112, true rewards: #0: 14.312
+ [2023-02-25 20:25:24,793][36780] Avg episode reward: 36.112, avg true_objective: 14.312
+ [2023-02-25 20:25:24,851][36780] Num frames 7200...
+ [2023-02-25 20:25:24,967][36780] Num frames 7300...
+ [2023-02-25 20:25:25,088][36780] Num frames 7400...
+ [2023-02-25 20:25:25,208][36780] Num frames 7500...
+ [2023-02-25 20:25:25,324][36780] Num frames 7600...
+ [2023-02-25 20:25:25,438][36780] Num frames 7700...
+ [2023-02-25 20:25:25,556][36780] Num frames 7800...
+ [2023-02-25 20:25:25,676][36780] Num frames 7900...
+ [2023-02-25 20:25:25,794][36780] Num frames 8000...
+ [2023-02-25 20:25:25,904][36780] Num frames 8100...
+ [2023-02-25 20:25:26,020][36780] Num frames 8200...
+ [2023-02-25 20:25:26,090][36780] Avg episode rewards: #0: 34.686, true rewards: #0: 13.687
+ [2023-02-25 20:25:26,092][36780] Avg episode reward: 34.686, avg true_objective: 13.687
+ [2023-02-25 20:25:26,202][36780] Num frames 8300...
+ [2023-02-25 20:25:26,321][36780] Num frames 8400...
+ [2023-02-25 20:25:26,443][36780] Num frames 8500...
+ [2023-02-25 20:25:26,565][36780] Num frames 8600...
+ [2023-02-25 20:25:26,677][36780] Num frames 8700...
+ [2023-02-25 20:25:26,795][36780] Num frames 8800...
+ [2023-02-25 20:25:26,910][36780] Num frames 8900...
+ [2023-02-25 20:25:27,019][36780] Avg episode rewards: #0: 31.640, true rewards: #0: 12.783
+ [2023-02-25 20:25:27,023][36780] Avg episode reward: 31.640, avg true_objective: 12.783
+ [2023-02-25 20:25:27,087][36780] Num frames 9000...
+ [2023-02-25 20:25:27,221][36780] Num frames 9100...
+ [2023-02-25 20:25:27,360][36780] Num frames 9200...
+ [2023-02-25 20:25:27,484][36780] Num frames 9300...
+ [2023-02-25 20:25:27,606][36780] Num frames 9400...
+ [2023-02-25 20:25:27,723][36780] Num frames 9500...
+ [2023-02-25 20:25:27,844][36780] Num frames 9600...
+ [2023-02-25 20:25:27,958][36780] Avg episode rewards: #0: 29.440, true rewards: #0: 12.065
+ [2023-02-25 20:25:27,960][36780] Avg episode reward: 29.440, avg true_objective: 12.065
+ [2023-02-25 20:25:28,025][36780] Num frames 9700...
+ [2023-02-25 20:25:28,152][36780] Num frames 9800...
+ [2023-02-25 20:25:28,281][36780] Num frames 9900...
+ [2023-02-25 20:25:28,398][36780] Num frames 10000...
+ [2023-02-25 20:25:28,517][36780] Num frames 10100...
+ [2023-02-25 20:25:28,632][36780] Avg episode rewards: #0: 27.047, true rewards: #0: 11.269
+ [2023-02-25 20:25:28,633][36780] Avg episode reward: 27.047, avg true_objective: 11.269
+ [2023-02-25 20:25:28,706][36780] Num frames 10200...
+ [2023-02-25 20:25:28,821][36780] Num frames 10300...
+ [2023-02-25 20:25:28,934][36780] Num frames 10400...
+ [2023-02-25 20:25:29,049][36780] Num frames 10500...
+ [2023-02-25 20:25:29,169][36780] Num frames 10600...
+ [2023-02-25 20:25:29,284][36780] Num frames 10700...
+ [2023-02-25 20:25:29,398][36780] Num frames 10800...
+ [2023-02-25 20:25:29,517][36780] Num frames 10900...
+ [2023-02-25 20:25:29,640][36780] Num frames 11000...
+ [2023-02-25 20:25:29,763][36780] Num frames 11100...
+ [2023-02-25 20:25:29,890][36780] Num frames 11200...
+ [2023-02-25 20:25:30,010][36780] Num frames 11300...
+ [2023-02-25 20:25:30,129][36780] Num frames 11400...
+ [2023-02-25 20:25:30,255][36780] Num frames 11500...
+ [2023-02-25 20:25:30,375][36780] Num frames 11600...
+ [2023-02-25 20:25:30,496][36780] Num frames 11700...
+ [2023-02-25 20:25:30,622][36780] Num frames 11800...
+ [2023-02-25 20:25:30,742][36780] Num frames 11900...
+ [2023-02-25 20:25:30,865][36780] Num frames 12000...
+ [2023-02-25 20:25:31,032][36780] Num frames 12100...
+ [2023-02-25 20:25:31,199][36780] Num frames 12200...
+ [2023-02-25 20:25:31,328][36780] Avg episode rewards: #0: 30.142, true rewards: #0: 12.242
+ [2023-02-25 20:25:31,331][36780] Avg episode reward: 30.142, avg true_objective: 12.242
+ [2023-02-25 20:26:53,706][36780] Replay video saved to /content/train_dir/default_experiment/replay.mp4!